1 Introduction

A power grid is a complex technical system; hence, it is prone to technical disturbances. Even direct losses from infrastructure damage and blackouts are enormous, so, much attention is paid to assuring system sustainability under probable external and internal shocks. Controlled islanding is the process of splitting an interconnected power grid into smaller electrically independent parts. It is used as a last-resort effort to cope with many technical disturbances including undamped oscillations, voltage collapse, cascading trips, etc. [2, 41].

The rationale behind the process of controlled islanding is that a smaller grid is easier to stabilize: islands have not be synchronized, low frequency oscillations are less likely to occur in a small grid, and so on. Also, the islanding operation can isolate ill parts of the system from healthy ones, and the blackout will be localized at ill islands if not avoided.

So, controlled islanding is a last chance to keep the grid alive (at least, partially), which is a desirable effect. But there are also some side effects.

First of all, controlled islanding requires a series of complex actions performed with high accuracy and coordination. Any error can cause a cascading trip of generators or transmission lines.

Secondly, some power transmission lines are switched off during the islanding operation, which makes a great shock to the grid even when islanding is accurately planned and perfectly implemented. Such disruption leaves a partitioned grid in a highly unstable state, making it questionable to stabilize the state of some islands. The simplest metrics of the power flow disruption is the total volume of power flows broken during an islanding operation.

Thirdly, a well-designed interconnected grid has more opportunities to serve the current demand than any partitioned grid due to limited power transmission and ramp rate opportunities of the latter. Therefore, some load shedding is an essential part of the controlled islanding process [29].

After all, restoring the grid after the controlled islanding is also a time-consuming and complex operation.

Bearing in mind high risks of controlled islanding, it is important to plan properly the islanding operation and to suggest a realistic and safe grid partitioning scheme that will stabilize the grid and minimize side effects.

Optimal islanding of a real-world power system is a high-dimensional optimization problem. The islanding decision is made under the extremal time pressure; a system operator has just several seconds to develop and implement an islanding scheme. Therefore, finding an optimal grid partition becomes a non-trivial task.

Metrics of system stability considered by power system security studies are computationally expensive. Shocks incurred by the islanding operation must be classified as large-scale, and non-linear effects cannot be neglected. Hence, to make reliable predictions of the after-islanding system dynamics extensive time-domain simulations are performed under the paradigm of the transient stability analysis [30].

That is why most formal models of optimal controlled islanding (OCI) do not consider directly restoring system stability as an optimization criterion. Workarounds include incorporating external constraints (e.g., limited island volume) or constructing some simpler criteria, e.g., degree of generators’ coherency (or dynamic coupling) in the pre-islanding system state. Many articles leave stability constraints behind the scene (e.g., [16, 38, 39, 43] and many others) concentrating on minimization of side effects of the islanding process, and most of them take into account only a single aspect (load shedding in [20, 35]) or a couple of aspects (e.g., generator coherency and flow disruption in [16]).

The mathematical model of OCI introduced in the present article takes into account multiple aspects of the islanding process and their corresponding performance metrics (generator coherency [8], flow disruption [32], load shedding, and, optionally, line susceptance). A fast and efficient two-step algorithm is proposed to calculate a rational scheme of partitioning a grid into the desired number of islands K with the limited maximum island volume.

In the first step hierarchical spectral clustering algorithms from [37, 39] are used to break the grid into \(n'>K\) islands to minimize the weighted combination of coherency and flow disruption metrics.

In the second step each of detailed grid partitions obtained in the first step is transformed into the aggregated grid with \(n'\) vertices. The aggregated grid is then partitioned into K connected islands with the limited maximum island volume by an exact algorithm implemented in CPLEX 12. A greedy heuristics provides an efficient starting solution, which sufficiently fosters calculations. In addition to generator coherence and flow disruption the optimization criterion in the second step also takes into account the power imbalance inside islands.

The idea is to combine high speed (as the problem dimension is efficiently reduced in the first step) and high flexibility (complex optimization criteria and constraints are allowed in the second step). Computational experiments on the models of standard power systems with 118, 2383, and 9241 nodes show that the proposed algorithm is fast enough and outperforms alternative approaches both in bulk and in the value of every single performance metric.

The rest of the article has the following structure. In Sect. 2 recent approaches to OCI are surveyed. Then in Sect. 3 we introduce the notation and basic mathematical concepts used to define controlled islanding performance metrics. In Sect. 4 essential information is provided about spectral clustering, which is the main tool for fast and efficient islanding. The two-step algorithm for multi-objective grid partitioning is introduced in Sect. 5, while computational experiments on three power systems are presented in Sect. 6. Section 7 concludes with some open issues and perspectives.

2 Literature review

After a critical breakdown (e.g., a shortage, a circuit trip, or a generator failure) a power grid may become instable. Different groups of generator go out of sync, and the main goal of the automated grid control is to return the system into the stable state with minimum load shedding. In [2, 44] it is shown that controlled islanding (accompanied with appropriate load shedding) can be a promising strategy to prevent cascading blackouts in power systems. A sort of modal analysis (analysis of normal forms) was employed in [44] to perform generators’ grouping while in [2] predefined islands were considered. At the same time, the choice of a rational grid partitioning strategy was a challenging discrete optimization problem. System stability and the amount of load shedding were considered as the main criteria of partition quality.

Load shed is the amount of load that cannot be served safely given the topology of islands in the grid according to voltage and safety constraints and, thus, should be disconnected during the islanding operation. Load shedding can be obtained by solving the optimal load shedding (OLS) problem under the alternating current (AC) model. AC–OLS reduces to a non-linear optimization problem [35], which takes sufficient time to solve. A simpler version of OLS problem often used in contingency analysis is the direct current (DC) approximation, which reduces to the linear program with (real) power flow balance, phase angle and maximum real flow constraints [35].

Even in a small power grid the number of alternative islanding schemes is enormous. An operator has just few seconds to choose and implement a grid islanding scheme (e.g., some generators go out of step within 5 sec after the contingency in the scenario modeled in [48]). A detailed time-domain simulation cannot be run for every alternative partition to verify its stability, and indirect stability indicators are typically used at the partition selection stage (a notable exception is the use of PSSENG time-domain simulation in the genetic algorithm [19] to evaluate island stability of IEEE 118-bus scheme).

A popular stability indicator of an electrical power system is coherency of its generators [50] (their aspiration to swing together). Generator grouping methods based on slow coherency detection were primarily developed for system model reduction [8, 10, 11]. Several approaches were proposed (see [9, 51]) to construct complete grid partitions by assigning loads to generator groups, but they ignored load shedding and the other metrics relevant to controlled islanding (e.g., transmission line security constraints).

In  [41, 52] controlled islanding is considered as a satisfiability problem (the problem of finding a grid partition that fulfills a set of constraints). Ordered binary decision diagrams (OBDD) were used for limited enumeration of islanding schemes that satisfy generator coherency and load/generation balance constraints, while DC–OLS is run for each candidate solution until stability constraints are verified. OBDD-based enumeration is time-demanding, so large real-world networks have to be aggregated before applying the algorithm.

Load shedding is minimized in [35] by solving a series of DC–OLS problems inside a greedy algorithm. Stochastic programming is used in [24, 25] to find an islanding scheme, which minimizes the average load shedding against a series of pre-defined contingencies. In [18, 20, 42, 43] the grid partitioning problem is reduced to the mixed-integer linear program (MILP) with additional generator coherency constraints in [18, 42, 43] and connectivity constraints in [18]. MILP is solved with exact algorithms implemented in CPLEX numeric optimization package. Nevertheless, computational experiments on grids with \(\le \)300 nodes show that these algorithms are time expensive and, hence, hardly applicable to online islanding problems.

To improve time efficiency of algorithms the amount of load shedding is often approximated by the load/generation imbalance. In particular, a heuristic algorithm is proposed in [46] to minimize total load/generation imbalance under generator coherency constraints, but its time and cost efficiency was not compared to alternative approaches.

In addition to generator coherency, the power flow disruption is another popular indicator of the island (in)stability [32, 36]. When a transmission line is tripped during an islanding operation, the power flow through this line immediately drops to zero. The greater is the disruption (total power flow through the lines being teared), the greater is the excessive shock to the islanded power system (in addition to the initial disturbance) and the less possible is its stabilization.

Algorithms of controlled islanding based on spectral clustering techniques are intensively studied in recent years. The idea of using spectral clustering techniques for fast partitioning of electrical networks can be traced back to [32, 49]. A Spectral Clustering Controlled Islanding (SCCI) scheme is proposed in [16]. It accounts both for generator coherency and power flow disruption. In the first step the normalized spectral bisection is used to divide the generators in two coherent groups. In the second step loads are assigned to generator groups using the unnormalized constrained spectral bisection to minimize the flow disruption. Recursive bisection is applied to obtain the desired number of islands. SCCI algorithm is shown to be much faster than OBDD-based enumeration techniques [41, 52] with the minor loss in solution quality.

Later this methodology was extended in different aspects. In particular, k-medoids algorithm is used to cluster eigenvector points [17] instead of k-means suggested in [32]. The problem of outliers when performing eigenvector analysis is addressed in [17]. Recursive bisection was replaced by the direct k-way partitioning in [17, 39], which decreased computation cost and increased partition quality.

In [39] it is proposed to incorporate hierarchical clustering algorithm (first introduced in [47]) into the spectral clustering scheme to stimulate generation of connected islands. In [37] this approach is extended to account for generator coherency. Predefined generator groups are considered and loads are assigned to the nearest neighboring generator in the same spectral graph embedding. Finally, in [38] the spectral clustering approach is applied to the problem of parallel system restoration, which differs from controlled islanding in several important aspects.

An essential limitation of OCI algorithms based on spectral clustering is their disregard of load shedding. A yet another serious gap is that spectral methods sometimes result in extremely imbalanced partitions (those consisting of a huge mainland and several tiny islets), which is not practical in many ways.

The present article is devoted to further development of the spectral approach to OCI. We overcome existing shortcomings by using the hierarchical spectral clustering algorithms from [37, 39] as pre-processing routines to decrease the problem dimensionality to the desired degree caring only for generator coherency and flow disruption. Island volumes and load shedding are accounted for in the second step of the proposed algorithm, where mixed-integer quadratic problem (MIQP) is solved.

3 Basic notation

Every islanding decision is unique, since it is made under unique (and often unexpected) conditions. The state of the power system at the moment before the islanding decision is described by several groups of variables. Table 1 summarizes the notation used to define the context of an islanding process and its performance metrics.

Let us consider a power grid consisting of n nodes (so called, buses) indexed from 1 to n, and m transmission lines. Without loss of generality assume that at most one generator or a load is assigned to each bus and all generators are located at the first \(n_g\le n\) buses. Bold is used for vectors.

Table 1 Nomenclature

Let \(\mathbf {1}_n\) be the n-dimensional all-ones vector, \(\mathbf {0}_n\) be the n-dimensional all-zeros vector, and let \(I_{n\times n}={{\mathrm{diag}}}(\mathbf {1}_n)\) be the \(n\times n\) identity matrix. For real symmetric \(n\times n\) matrix \(A=(a_{ij})_{i,j=1}^n\) its Laplace matrix is defined as

$$\begin{aligned} L(A):={{\mathrm{diag}}}(A\cdot \mathbf {1}_n)-A. \end{aligned}$$

If n-dimensional positive vector of volumes\(\mathbf {w}\) is also given (see Table 1 for possible definitions of \(\mathbf {w}\)), the symmetric normalized Laplace matrix of matrix A under volumes \(\mathbf {w}\) is defined as

$$\begin{aligned} L_{sym}(A|\mathbf {w}):={{\mathrm{diag}}}(\mathbf {w})^{-\frac{1}{2}}\cdot L(A) \cdot {{\mathrm{diag}}}(\mathbf {w})^{-\frac{1}{2}}. \end{aligned}$$

Power flows in a power grid are naturally modeled by a directed graph with weights assigned to its vertices (representing injections in buses) and to arcs (representing real power flows through transmission lines). Let us introduce the notion of the graph cut, which plays the central role throughout the article. Consider a simple directed graph with vertex set N and arc weights \(a_{ij}\), \(i,j \in N\). For vertex set \(s\subseteq N\) the cutset is the minimum set of arcs needed to be removed to isolate s from the rest of the graph. The total weight of the cutset is called the cut, which is calculated as

$$\begin{aligned} Cut_A(s):=\sum _{i\in s,j\in N\backslash s}a_{ij}=x^T L(A) x, \end{aligned}$$

where \(A:=(a_{ij})_{i,j=1}^n\) is the matrix of arc weights, and x is an indicatory vector of vertex set s (i.e., \(x_i=1\) if vertex \(i\in s\), and \(x_i=0\) otherwise).

An islanding scheme with K islands is represented by partition \(\pi =(s_1,\ldots ,s_K)\) of vertex set N into K disjoint parts. For partition \(\pi =(s_1,\ldots ,s_K)\) the cut is defined as

$$\begin{aligned} Cut_A(\pi ):=\sum _{k=1}^K Cut_A(s_k)= {{\mathrm{tr}}}X^T L(A)X, \end{aligned}$$

where X is \(n\times K\) indicatory matrix of partition \(\pi \) (i.e., \(x_{ik}\) is equal to unity when \(i\in s_k\) and is zero otherwise).

For positive vector \(\mathbf {w}=(w_i)_{i=1}^n\) of vertex volumes the normalized cut is defined as

$$\begin{aligned} NCut_A(\pi |\mathbf {w}):=\sum _{k=1}^K \frac{Cut_A(s_k)}{w(s_k)}= {{\mathrm{tr}}}Z^T L_{sym}(A|\mathbf {w})Z, \end{aligned}$$
(1)

where \(w(s):=\sum _{i\in s} w_i\) is the volume of island \(s\subseteq N\), \(Z = {{\mathrm{diag}}}(\mathbf {w})^{\frac{1}{2}} X {{\mathrm{diag}}}(X^T\mathbf {w})^{-\frac{1}{2}}\) (in the other words, \(z_{ik} = \sqrt{\frac{w_i}{w(s_k)}}\) if \(i \in s_k\) and is zero otherwise.) Matrix Z is orthogonal, i.e., \(Z^TZ=I\).

Many popular performance metrics of controlled islanding can be written in terms of a graph cut with appropriately weighted graph arcs. For any K-partition \(\pi =(s_1,\ldots ,s_K)\)generators’ dynamic coupling is defined in [16] as

$$\begin{aligned} C(\pi )=Cut_\Phi (\pi ), \end{aligned}$$
(2)

and power flow disruption is defined as

$$\begin{aligned} D(\pi )=Cut_{|P|}(\pi ). \end{aligned}$$
(3)

Here the absolute power flow matrix (nonnegative and symmetric) is defined as \(|P|:=(\frac{1}{2}|p_{ij}|+\frac{1}{2}|p_{ji}|)_{i,j=1}^n\), where \(p_{ij}\) is the real power flow along arc ij.

In the present article we follow [16] and use dynamic coupling matrix \(\Phi \) (defined in Table 1) to build metric (2) of generator coherence. Alternatively, more sophisticated approaches (e.g., independent component analysis [4] or hierarchical trajectory cluster analysis [28]) can be employed to build another matrix of generators’ coherence, which can be used in \(C(\pi )\).

The electrical cohesiveness index (ECI), defined as

$$\begin{aligned} ECI(\pi ):=Cut_\Delta (\pi ), \end{aligned}$$

was used in [13] to split the grid on the basis of electrical distance matrix \(\Delta \).

The load shed is the amount of load that cannot be served safely given the topology of islands in the grid according to voltage and safety constraints. For partition \(\pi =(s_1, \ldots , s_K)\) the load shed is denoted as

$$\begin{aligned} S(\pi ):=\sum _{k=1}^K S(s_k), \end{aligned}$$

where \(S(s_k)\) is the load shed in island \(s_k\). Several approaches with different accuracy and computational complexity are used to evaluate \(S(s_k)\). Let \(S_{AC}(s_k)\) be the amount of load shedding in island \(s_k\) in the solution of AC–OLS. Since AC–OLS is computationally expensive, below we calculate it only for final partitions to verify the quality of the solution.

DC–OLS is a simpler version of OLS problem limited to real flows with fixed bus voltages, zero line losses, and low phase angle differences. DC–OLS reduces to the linear program with real power flow balance, phase angle and maximum real flow constraints [35]. The load shed in island \(s_k\) calculated from OLS–DC solution is denoted by \(S_{DC}(s_k)\).

The estimate of load shedding amount calculated as a solution of the maximum flow problem for island \(s_k\) obtained by relaxing phase angle constraints in DC–OLS, is denoted as \(S_{MF}(s_k)\) (see more details in Sect. 5.4 below). Finally, when maximum real flow constraints are relaxed, the minimum amount of load shedding is estimated by excess load:

$$\begin{aligned} S_{EL}(s_k):=\max \left[ p(s_k); 0\right] , \end{aligned}$$
(4)

where \(p(s):=\sum _{i\in s} (d_i-g_i)\) is total imbalance between load \(d_i\) and generation \(g_i\) in nodes \(i\in s\) of island \(s\subseteq N\). If losses in lines are neglected, then \(p(s_k)=Cut_P(s_k)\), where \(P=(p_{ij})_{i,j=1}^n\) is a matrix of real power flows, and excess load is also expressed using the (directed) graph cut.

Finally, we note that \(S_{EL}(s) \le S_{MF}(s) \le S_{DC}(s) \le S_{AC}(s)\) for any island \(s\subseteq N\), so AC–OLS gives the most conservative estimate of the load shed.

4 Spectral clustering basics

Spectral clustering is an approach to finding approximate solutions of minimum cut problems for weighted undirected graphs. Consider an undirected simple graph \(\langle N, E\rangle \) with vertex set \(N=\{1,\ldots ,n\}\), edge set \(E\subseteq N\times N\), and non-negative edge weights \(a_{ij}\), \(ij\in E\). Then, given matrix \(A=(a_{ij})_{i,j=1}^n\) (non-negative and symmetric), the graph minimum K-cut (or K-partition) problem is to find a partition \(\pi =(s_1,\ldots ,s_K)\) of a vertex set N into K disjoint parts that minimizes \(Cut_A(\pi )\).Footnote 1

If, in addition, positive volume \(w_i\) is assigned to every vertex \(i\in N\), the minimum balanced K-cut problem is to minimize \(Cut_A(\pi )\) by choosing a partition \(\pi =(s_1,\ldots ,s_K)\), such that \(w(s_k)\le W, k=1,\ldots ,K\), where \(W \ge W(N)/K\) is an upper cluster volume limit.

The minimum K-cut problem is known [26] to be solvable in polynomial time \(O(n^{K^2})\) for any fixed K, but is NP-complete [23] when K is not limited. The minimum balanced cut problem is NP-complete [23] even for the graph bipartition problem (when \(K=2\), n is even, \(w_i=1\), and \(W = n/2\)).

Many efficient approximate graph partitioning algorithms were developed since then, with spectral clustering being among most popular ones. For spectral clustering the balanced K-cut problem is replaced with \(NCut_A(\pi |\mathbf {w})\) minimization problem with no cluster volume constraints resulting is some sort of relaxation: for partition \(\pi = (s_1, \ldots , s_K)\)

$$\begin{aligned} NCut_A(\pi |\mathbf {w}) = \frac{Cut_A(s_1)}{w(s_1)}+\cdots +\frac{Cut_A(s_K)}{w(s_K)}, \end{aligned}$$
(5)

and denominators in (5) penalize small cluster volumes, while for equal cluster volumes (such that \(w(s_1)=\cdots =w(s_K)\)) we have

$$\begin{aligned} NCut_A(\pi |\mathbf {w})= Cut_A(\pi )\cdot \frac{K}{w(N)}. \end{aligned}$$

Spectral clustering is based on the spectral lower bound for the trace minimization problem [33]. For an arbitrary real symmetric \(n\times n\) matrix A let \(\lambda _i(A)\), \(i=1,\ldots ,n\), be its eigenvalues enumerated in ascending order, and let \(\mathbf {u}_i(A)\) be corresponding eigenvectors. Then for any \(n\times K\) orthogonal matrix Z, \(K\le n\), the following inequality holds:

$$\begin{aligned} {{\mathrm{tr}}}Z^T A Z \ge \lambda _1(A) + \cdots + \lambda _K(A) = {{\mathrm{tr}}}U^T A U, \end{aligned}$$
(6)

where \(U = (\mathbf {u}_1(A), \ldots , \mathbf {u}_K(A))\) is a matrix composed of K first (normalized) eigenvectors of matrix A.

Any K-partition can be written in the form of an orthogonal \(n\times K\) binary matrix \(X = (x_{ik})\), where \(x_{ik} = 1\) if an only if i-th node belongs to k-th cluster. Spectral clustering algorithms approximate expression \({{\mathrm{tr}}}U^T A U\) in (6) with some admissible partition X, which is close in some sense to matrix U. The convenient notation is developed below for basic algorithms of spectral clustering, which is used in the next section to build combined partitioning algorithms.

According to Eq. (1), the normalized cut is written with normalized Laplace matrix \(L_{sym}(A|\mathbf {w})\), so in the classical algorithm [34] the rows of matrix U, whose columns are the first K eigenvectors of \(L_{sym}(A|\mathbf {w})\), are normalized to the unit Euclidian norm and then partitioned with k-means clustering. Each row corresponds to a graph vertex. Denote the resulting K-partition with \(\kappa _K(A|\mathbf {w})\).

The algorithm of k-means is known to be sensitive to outliers (arising, for example, when graph has pendent vertices). It is shown in [15] that more stable islanding schemes for real power grids can be obtained when k-means is replaced with the more robust k-medoids algorithm. The corresponding partition is denoted with \(\mu _K(A|\mathbf {w})\).

Both k-means and k-medoids often suggest disconnected subgraphs as partition elements. To deal with this problem it is suggested in [39] to use hierarchical clustering instead. In the Hierarchical Spectral Clustering (HSC) algorithm [39] the graph is pre-processed by iteratively merging pendent vertices to their neighbors. No pendent vertex is left in the graph to avoid the outlier problem. Then eigenvector matrix U is calculated for the normalized Laplacian of the absolute power flow matrix |P|. Normalized rows of matrix U are considered as points on a K-dimensional sphere and the graph being partitioned is embedded onto this sphere. The distance between any pair of graph vertices is calculated as the length of the shortest path in the embedded graph (the cosine distance between incident nodes is considered). A hierarchical clustering algorithm [47] is then applied to the obtained distance matrix (Standard Matlab implementation with complete linkage [14] is used below). The shortcoming of this algorithm is bigger variation of island volumes: a partition often consists of one or two big islands and a collection of small islets. Let \(\chi _K(|P| ~ |\mathbf {w})\) stand for the K-partition built with hierarchical normalized spectral clustering.

Let us consider a simplistic numeric example by partitioning a tiny power grid (the IEEE 9-bus system shown in Fig. 1) by HSC algorithm. The network has 3 generators installed, 3 loads and the total of 9 buses. The (outbound) real power flows, current real power generations and loads calculated using AC-OPF are presented in Fig. 1 with arrows showing the flow direction. To calculate a bisection of this network with HSC algorithm, pendent nodes 1, 2, and 3 are joined, respectively to nodes 4, 7, and 9. The matrix of absolute power flows for the resulting 6-nodal graph is

$$\begin{aligned} |P|=\left( \begin{array}{cccccc} 0 &{}\quad 27.15&{}\quad 17.60&{}\quad 0 &{}\quad 0 &{}\quad 0 \\ 27.15&{}\quad 0 &{}\quad 0 &{}\quad 36.05&{}\quad 0 &{}\quad 0 \\ 17.60&{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 27.45 \\ 0 &{}\quad 36.05&{}\quad 0 &{}\quad 0 &{}\quad 30.95&{}\quad 0 \\ 0 &{} \quad 0 &{}\quad 0 &{}\quad 30.95&{} \quad 0 &{} \quad 19.10 \\ 0 &{}\quad 0 &{}\quad 27.45&{} \quad 0 &{}\quad 19.10&{}\quad 0 \\ \end{array} \right) , \end{aligned}$$

and matrix

$$\begin{aligned} U=\left( \begin{array}{cc} -0.4603 &{}\quad -0.0075\\ -0.2665 &{}\quad 0.3019\\ -0.4708 &{} \quad 0.7119\\ -0.2808 &{}\quad -0.0240\\ -0.5631 &{}\quad -0.5882\\ -0.3155 &{} \quad -0.2355\end{array}\right) \end{aligned}$$

stores the two smallest eigenvectors of its normalized Laplacian.

Each diamond or cross on the unit circle in Fig. 2a represents some row of matrix U additionally normalized to the unit norm. Each point corresponds to a graph node (their numbers are listed in the figure with generator nodes marked with the star), and connecting the points with graph edges (bold curves in the figure) we obtain the spectral graph embedding. Every edge is labelled with a weight being equal to the distance on the circle between its ends. The dendrogram of the hierarchical clustering applied to the weighted distance matrix of this graph is also shown in Fig. 2a. The root of the dendrogram is located at zero, and the smaller dashed circle in Fig. 2a shows the level, at which exactly two clusters are separated. The nodes of the first cluster are denoted with diamonds, while the nodes of the second one are denoted with crosses. It is worth noting that since we use graph distances, node \(1,4^*\) is closer to node 5 than to node 8 (which can also be seen at the dendrogram).

The resulting islanding scheme is presented in Fig. 2b, which shows teared lines and equilibrium post-islanding generations, power flows, and loads (the shares of demand served are shown in frames). Islanding performance metrics are also presented: generator coherence C, disruption D and the amount of load shedding S (according to AC–OPF model).

Fig. 1
figure 1

Example of pre-islanded network and power flows

Fig. 2
figure 2

An illustration of HSC algorithm

In many applications it is important to keep some pairs of vertices in different clusters (e.g., generators with low dynamic coupling) and the other pairs should always be assigned to a single cluster (e.g., highly coupled generators). The algorithm of constrained spectral clustering [6] uses projection matrix technique to generalize the spectral clustering approach to the case of such must-link and cannot-link constraints, but it can be applied only for graph bisection (for \(K=2\)). Constrained spectral clustering is used in [16, 17, 37] to control coherent generator groups. In the first step of SCCI algorithm [16] the dynamic graph with set \(N_g=\{1, \ldots , n_g\}\) of generators and weights’ matrix \(\tilde{\Phi }\) is considered and all generators are divided in two coherent groups using some spectral bisection algorithm. In the second step unnormalized constrained spectral clustering is used to select a bisection of grid graph that minimizes flow disruption and fulfills must-link and cannot-link constraints: generators from the same coherent group must go to one island while those from different groups cannot go to one island.

Therefore, the first two eigenvectors \(\mathbf {u}_1,\mathbf {u}_2\) of the generalized eigenvalue problem

$$\begin{aligned} H^T L(|P|)H \mathbf {u}=\lambda H^T H \mathbf {u} \end{aligned}$$

are calculated, where H is a projection matrix. If, without loss of generality, \(s_1' = \{1, \ldots , n_1\}\), then

$$\begin{aligned} H = \left( \begin{array}{ccc} \mathbf {1}_{n_1} &{} \mathbf {1}_{n_1} &{} \mathbf {0}_{n_1} \\ \mathbf {1}_{n_g-n_1} &{} -\mathbf {1}_{n_g-n_1} &{} \mathbf {0}_{n_g-n_1} \\ \mathbf {1}_{n-n_g} &{} \mathbf {0}_{n-n_g} &{} I_{(n-n_g)\times (n-n_g)} \\ \end{array} \right) . \end{aligned}$$

After that the rows of matrix \(U=(\mathbf {u}_1,\mathbf {u}_2)\) are split in two clusters using the k-medoids algorithm. The same procedure should be applied recursively to the biggest island of the partition until the desired number of partitions is obtained.

Another Constrained Spectral Clustering (CSC) methodology of intentional controlled islanding is proposed in [37]. CSC algorithm starts from predefined groups \(s_1^g, \ldots , s_K^g\subseteq N_g\) of coherent generators (probably, identified with independent component analysis [4] or hierarchical trajectory cluster analysis [28]) and aims at forming islands by distributing loads between these generator groups to minimize the normalized cut of absolute power flow matrix.

Analogously to HSC algorithm, K first eigenvectors of the normalized Laplacian of absolute power flow matrix |P| are used to calculate the graph embedding onto the K-dimensional unit sphere. As in HSC algorithm, the distance between a pair of graph vertices is evaluated as the length of the shortest path in the embedded graph (again, cosine distance between incident vertices is considered). Finally, the K-partition of the graph is built by assigning each vertex to the i-th island if and only if the nearest generator vertex in the embedded graph belongs to group \(s_i^g\). Let us denote with \(\sigma _K(|P|~|\mathbf {w},\pi ^g)\) a K-partition built by CSC algorithm given absolute power flow matrix |P|, partition \(\pi ^g=\{s_1^g, \ldots , s_K^g\}\) of generators into coherent groups, and vector \(\mathbf {w}\) of bus weights.

To illustrate CSC algorithm, let us bipartition the 9-bus network (see Fig. 1). The spectral graph embedding coincides with that of HSC algorithm (see Fig. 3a). From Fig. 1 we see that generators at buses 2 and 3 have higher dynamic coupling \(\phi _{ij}\), so let us form the two generator groups: \(\{1\}\) (denoted with the black cross in Fig. 3a) and \(\{2,3\}\) (denoted by two black diamonds in Fig. 3a). Any other bus is assigned to the generator group being closest to this bus in the graph embedding (see the dashed arrows in Fig. 3a). Two resulting islands, post-islanding power flows, and islanding performance metrics are presented in Fig. 3b. Load shedding at buses 5 and 6 is required to balance generation and load in the bigger island.

Fig. 3
figure 3

An illustration of CSC algorithm

Like SCCI, CSC algorithm takes care both of generator coherence and power flow disturbance, but CSC is shown in [37] to work faster. At the same time, it disregards load shedding (e.g., compare S in Figs. 2b, 3b). Also, under this methodology the size of islands being created is hardly controllable.

The goal of the present article is to improve spectral clustering algorithms of OCI (namely, HSC and CSC algorithms) by proposing a methodology, which flexibly accounts for generator coherence, power flow disruption, load shed, and island sizes.

The deeper insight into spectral clustering techniques can be found in [38, 39, 45] and in the references provided by these articles.

5 Improved spectral clustering algorithm

5.1 Problem setting

Effect of islanding criterion on the stability of the islanded grid was analyzed in [29]. An experiment with IEEE 118-bus scheme has shown that an islanding scheme with minimum power flow disruption is the most stable one; the scheme with minimum excess demand results in considerably lower load shedding at the cost of longer relaxation time, while under a coherency-based islanding scheme, which ignores disruption and imbalance, generators fail to stabilize.

It is not clear at the moment to what extent these observations generalize to other contingency cases and to other power grids: different performance metrics may be valuable predictors of island stability in different situations. Therefore, a universal algorithm of OCI should combine multiple criteria (generator coherency, power flow disruption, some metric of load shedding, minimum or maximum island volume, and others) and their relative importance should be flexibly adjusted.

Distinct to numerous approaches to OCI that consider preserving generator coherence as a primary optimization goal and calculate coherent generator groups in advance using the two-time-scale theory [8, 10, 11] or recent approaches [4, 28], below the slow coherency detection technique from [16] is adopted and dynamic coupling \(C(\cdot )\) is included into the optimization criterion along with other performance metrics. OCI is considered as a multi-objective optimization problem: to find a K-partition \(\pi =(s_1,\ldots ,s_K)\) of grid graph \(\langle N, E\rangle \) that minimizes the weighted sum of multiple metrics

$$\begin{aligned} F(\pi )=\alpha _C C(\pi )+\alpha _D D(\pi )+\alpha _{ECI} ECI(\pi )+\alpha _S S(\pi ) \end{aligned}$$
(7)

and has the limited maximum island volume:

$$\begin{aligned} w(s_k)\le W,\quad k=1,\ldots ,K. \end{aligned}$$
(8)

The proper choice of weights is discussed in the last section.

Without loss of generality assume \(\alpha _S=1\). Then \(F(\pi )\) can be written as

$$\begin{aligned} F(\pi )=Cut_A(\pi )+S(\pi ), \end{aligned}$$
(9)

where

$$\begin{aligned} A = (a_{ij})_{i,j=1}^n := \alpha _C \Phi +\alpha _D |P|+\alpha _{ECI} \Delta . \end{aligned}$$
(10)

If load shedding is estimated using the maximum flow approximation (see the previous section), OCI reduces to choosing \(x_{ik} \in \{0,1\}\), \(z_{ij} \in \{0,1\}\), \(y_{ij}\in \left[ -\bar{p}_{ij}z_{ij}, \bar{p}_{ij}z_{ij}\right] \), \(l_i \in [0, D_i]\), \(g_i\in [0, G_i]\) for \(i\in N\), \(k=1,\ldots ,K\), \(ij\in E\) to

$$\begin{aligned} \begin{array}{lll} \text {minimize} ~&{}{{\mathrm{tr}}}X^T L(A)X + \sum _{i=1}^n l_i &{}\\ \end{array} \end{aligned}$$
(11)

subject to constraints:

$$\begin{aligned} \text {island volume} ~&\sum _{i=1}^n w_ix_{ik}\le W&\forall k=1,\ldots ,K, \\ \text {nodal flow balance} ~&D_i-l_i=g_i+\sum _{j:ji\in E}y_{ji}-\sum _{j:ij\in E} y_{ij}&\forall i=1,\ldots ,n, \\ \text {islands' detachment} ~&\sum _{k=1}^K k(x_{ik}-x_{jk}) \le K(1-z_{ij}),&\\ -&\sum _{k=1}^K k(x_{ik}-x_{jk}) \le K(1-z_{ij})&\forall ij\in E,\\ \text {vertex partitioning} ~&\sum _{k=1}^K x_{ik}=1&\forall i=1,\ldots ,n. \end{aligned}$$

Here \(X = (x_{ik})\) is an \(n\times K\) indicatory matrix of the grid partition, so that \(x_{ik}=1\) if and only if bus i belongs to island k, \(y_{ij}\) is a real power flow through line \(ij\in E\) (limited by the maximum flow \(\bar{p}_ij\)), \(z_{ij}=1\) if and only if line \(ij\in E\) lies inside a single island (and, thus, is not switched off when the grid is partitioned), \(l_i\) and \(g_i\) are, correspondingly, the amount of load shedding and real power output at bus \(i\in N\), while \(D_i\) and \(G_i\) being, respectively, real power demand and maximum real power output of generator at bus i.

Since the Laplace matrix is always positively semidefinite, the problem in hand is convex MIQP with \(nK+m\) binary and \(2n+m\) real variables. Nevertheless, numeric optimization packages (e.g., CPLEX) cannot be applied directly to this problem due to its high dimension. Below an efficient computational approach is introduced that avoids the combinatorial explosion.

5.2 Idea of algorithm

Matrix A in expression (10) is symmetric and non-negative, so existing algorithms of spectral clustering can minimize \(Cut_A(\pi )\), the first term of cost function (9). At the same time, distinct to the standard graph cut problem the sparsity pattern of matrix A does not coincide with that of graph adjacency matrix due to coupling coefficients \(\phi _{ij}\) that directly “connect” non-adjacent generator buses \(i, j\in N_g\). As a result, the classical spectral clustering algorithm [34] often suggests disconnected islands with lots of generation-rich and generation-deficient connected components (and, hence, with poor load shedding). The connectivity problem can be avoided by using hierarchical spectral clustering [39] that, in most cases, generates connected islands.

Absolute power flow matrix |P| is included, among the others, into matrix A, so partition \(\pi \) with low \(Cut_A(\pi )\) typically has reasonably low flow disruption \(D(\pi )\). In turn, disruption, to some extent, correlates with load shedding: when losses are neglected, the inequality

$$\begin{aligned} S_{EL}(s)=\max \left[ Cut_P(s);0\right] \le Cut_{|P|}(s)=D(s) \end{aligned}$$

holds for any island \(s\subseteq N\), which means that low disruption implies low excess load (but not vice versa, in general). So, when looking for the partition that minimizes a combination of load shedding and disruption, one can limit attention to partitions with relatively low disruption.

We use this observation to propose the Improved Spectral Clustering (ISC) algorithm of controlled islanding. In the first step of the algorithm a limited set of partitions is obtained with low \(Cut_A(\cdot )\), while in the second step a partition that minimizes \(Cut_A(\cdot )+S_{MF}(\cdot )\) is selected from this set (load shedding is estimated using the maximum flow model). To detect more candidate partitions, in the first step different spectral clustering techniques are combined to build several alternative graph rK-cuts, where \(r>0\) is some granularity factor. Then, in the second step, some of rK islands are merged to obtain a K-partition that minimizes \(Cut_A(\cdot )+S_{MF}(\cdot )\).

The following story illustrates the idea of the algorithm. Imagine someone has a porcelain plate with flowers painted on it (see Fig. 4a and wants to divide it into four pieces of roughly equal size keeping flowers unbroken. To obtain such pieces a plate can be sawed carefully (one of possible solutions is shown in Fig. 4b but it is extremely time-consuming. Instead, one can break a plate into small pieces with a strong hammer blow and then glue some pieces back to compose as much unhurt flowers as possible. Although the latter approach may result in a suboptimal solution (e.g., five flowers are broken in Fig. 4c), it is much faster and can be the only alternative when time is expensive.

Details of the algorithm are explained in the next two subsections.

Fig. 4
figure 4

An illustration of the idea of the algorithm

5.3 Step 1: spectral clustering

Below we simplify presentation by assuming that \(\alpha _{ECI}=0\) in (7). To apply normalized spectral clustering, the balanced cut problem with maximum cluster volume constraints is replaced with the corresponding minimum normalized cut problem without cluster volume constraints. Seven different strategies are used in parallel in the first step to build overdetailed partitions. Then, in the second step of the algorithm, each of these partitions is rolled up into a K-partition.

  1. I

    Fixed-granularity strategy for matrix A. Granularity factor \(r_1>1\) (which is a tunable parameter of the algorithm) is chosen and \(r_1K\)-partition \(\pi ^I := \chi _{r_1K}(A|\mathbf {w})\) of the grid is built with HSC algorithm [39].

  2. II

    Fixed-granularity strategy for matrix |P|. It has already been noted that low disruption implies low excess load, so partitions with low \(D(\cdot )\) seem more likely to minimize cost function (7) than those with low \(C(\cdot )\). Therefore, partition \(\pi ^{II}:=\chi _{r_2K}(|P|~|\mathbf {w})\) is computed, where granularity factor \(r_2>1\) is a yet another tunable parameter.

  3. III

    Minimum-granularity strategy for matrix A. From inequality (6) it follows that exactly K Laplacian eigenvectors are enough to characterize a K-partition with small normalized cut. So, overdetailed partitions in Strategies I and II are forced by the need to satisfy volume constraints (8) and to leave some combinatorial space for the load shedding optimization. Instead, in Strategy III partition \(\pi ^{III} := \chi _{K'}(A|\mathbf {w})\) is chosen with minimum granularity \(K'\in \{K, \ldots , r_1K\}\) that satisfies cluster volume constraints (8) (i.e., if \(s\in \pi ^{III}\) then \(w(s)\le W\)).

  4. VI

    Minimum-granularity strategy for matrix |P|. Similarly to the previous strategy, we choose the partition \(\pi ^{IV} := \chi _{K''}(|P|~|\mathbf {w})\) with minimum granularity \(K''\in \{K, \ldots , r_2K\}\) that satisfies cluster volume constraints (8).

  5. V

    Minimum-granularity-refined CSC algorithm. CSC algorithm minimizes disruption under given groups of coherent generators. In our methodology selection of coherent generator groups on the basis of dynamic coupling \(\Phi \) (or alternative generator coherence metric) is a part of the optimization process. In Strategy V coherent generator groups for CSC algorithm are calculated by HSC algorithm applied to reduced dynamic coupling matrix \(\tilde{\Phi }\). Consequently, \(\pi ^V :=\sigma _{K'''}(|P|~| \chi _{K'''}(\tilde{\Phi }|\mathbf {w}), \mathbf {w})\) for the minimal granularity \(K'''\in \{K, r_3K\}\) that allows to satisfy island volume constraints (8) (as before, \(r_3>0\) is a tunable parameter).

  6. VI

    Fixed-granularity sequential strategy for |P| and A. HSC algorithm generates a partition \(\chi _K(|P|~|\mathbf {w})\) that has low disruption but neglects generator coherency. So, recursive bisection for graph edge weights’ matrix A is applied to partition \(\chi _K(|P|~|\mathbf {w})\) to obtain more granulated \(r_4K\)-partition (again, \(r_4>1\) is a tunable parameter), which combines low disruption and coherency. Let us denote with \(\nu (\sigma , \pi |\mathbf {w})\) a partition obtained from partition \(\pi \) by splitting island \(s\in \pi \), which has the biggest volume w(s), with a bisection procedure \(\sigma \). Define

    $$\begin{aligned} \nu _K(\sigma , \pi |\mathbf {w}) = \nu (\sigma , \nu (\ldots ,\nu (\sigma , \pi |\mathbf {w})| \ldots )|\mathbf {w}) \end{aligned}$$

    a partition being a result of K recursive bisections \(\nu \) of the initial partition \(\pi \). Then, \(\pi ^{VI} := \nu _{(r_4-1)K}(\chi _2(A|\mathbf {w}), \chi _K(|P|~|\mathbf {w})|\mathbf {w})\). The idea behind this strategy is that we never miss low-disruption partition \(\chi _K(|P|~|\mathbf {w})\) and potentially can improve by joining islands in another order in the second step of the algorithm.

  7. VII

    Crossing CSC and HSC partitions. CSC algorithm builds a partition \(\sigma _K(|P|~| \chi _K(\tilde{\Phi }|\mathbf {w}), \mathbf {w})\) focusing mainly on generator coherence. On the contrary, HSC algorithm constructs a partition \(\chi _K(|P|~|\mathbf {w})\) caring only for flow disruption. A finer partition \(\pi ^{VII} := \sigma _K(|P|~| \chi _K(\tilde{\Phi }|\mathbf {w}), \mathbf {w}) \wedge \chi _K(|P|~|\mathbf {w})\) is obtained as a meet of these two partitions in the aggregation lattice. Set \(s\subseteq N\) belongs to the meet if an only if \(s = s_1 \cap s_2\) for some \(s_1\in \sigma _K(|P|~| \chi _K(\tilde{\Phi }|\mathbf {w}), \mathbf {w})\) and \(s_2 \in \chi _K(|P|~|\mathbf {w})\).

We limit ourselves to the above seven strategies, although other approaches to granulated partition construction are also possible.

5.4 Step 2: Partitioning aggregated grid

In the second step of the algorithm the granulated partition is transformed into K-partition by fusing some islands together. Each of detailed partitions \(\pi ^I, \ldots \pi ^{VII}\) is processed separately and, probably, in parallel.

First of all, all connected components of islands in partition \(\pi ^i\) are detached making separate islands. Let \(\pi =(s_1, \ldots , s_{n'})\) be the resulting detailed partition where each island is connected (\(n'>K\)).

An aggregated grid is built such that each island \(s_k\) in \(\pi \) becomes its vertex with:

$$\begin{aligned}&\text {maximum real power output}\, G_k':=\sum _{i\in s_k}G_i, \\&\text {current real power output}\, g_k':=\sum _{i\in s_k}g_i, \\&\text {real power demand}\, D_k':=\sum _{i\in s_k}D_i, \\&\text {current real power load}\, d_k':=\sum _{i\in s_k}d_i, \\&\text {injection}\,\,p_k':=p(s_k), \\&\text {volume}\,\, w_k':=w(s_k). \\ \end{aligned}$$

Also, for all island pairs \(k,k'=1, \ldots , n'\) define

$$\begin{aligned}&\text {aggregated real power flows}\,p_{kk'}'=\sum _{i\in s_k,j\in s_{k'}}p_{ij}, \\&\text {real power limits}\, \bar{p}_{kk'}'=\sum _{i\in s_k,j\in s_{k'}}\bar{p}_{ij}, \\&\text {dynamic coupling coefficients}\,\phi _{kk'}'=\sum _{i\in s_k,j\in s_{k'}}\phi _{ij}. \\ \end{aligned}$$

Edge \(kk'\) is included into edge set \(E'\) of the aggregated grid if \(\bar{p}_{kk'}'>0\). Let \(m':= |E'|\) be the edge count of the aggregated grid, \(m'\le n'(n'-1)/2\).

OCI problem (11) for the aggregated grid is a relaxation of the same OCI problem for the original power grid, because it replaces a group of detailed constraints for vertices of one island with a lump-sum constraint for the island as a whole. The aggregated grid problem is MIQP with \(Kn'+m'\) binary and \(2n'+m'\) real variables. Its dimension depends on the dimension of the aggregated grid but not on the dimension of the original grid, and if granularity \(n'\) of partition \(\pi \) is small enough, this problem can be solved in eligible time using exact algorithms implemented in commercial numeric solvers (we use CPLEX 12.6.2.0).Footnote 2

If there are too many disconnected islands in \(\pi ^i\), dimension \(n'\) of the aggregated grid can still be too high for exact algorithms. In this case we suggest decreasing its dimension with the following greedy heuristics.

At each iteration this heuristics simplifies the partition by joining a pair of adjacent islands to minimize cost function (9) while fulfilling island volume constraints. The amount of load shedding in (9) is estimated with the excess demand \(S_{ED}(\cdot )\):

$$\begin{aligned} F_{ED}(\pi ):=Cut_A(\pi )+S_{ED}(\pi )=Cut_A(\pi )+\sum _{s\in \pi } \max [p(s),0]. \end{aligned}$$
(12)

The pseudo code is presented in Listing 1.

figure a

The K-partition calculated by this greedy heuristics is also used as a record in the exact partitioning algorithm of the aggregated grid.

Finally, ISC algorithm selects the partition with the lowest cost of MIQP solution for seven aggregated grids built on the basis of partitions \(\pi ^I, \ldots , \pi ^{VII}\).

The pseudo code of ISC algorithm is presented in Listing 2.

figure b

5.5 Numeric example

This subsection illustrates the proposed ISC algorithm by partitioning the 9-bus network (see Fig. 1) to minimize the sum of dynamic coupling, flow disruption, and load shedding (with unit weights). The desired number of clusters \(K=2\) and the granularity factor is set to \(r_i=1.5, i=1,\ldots ,4\), so, strategies I, II, and VI in the first step of ISC algorithm try to partition the network into 3 islands. Therefore, three eigenvectors are calculated, and the spectral graph embedding in Strategies I and II is three-dimensional. Strategy I is based on matrix A (a linear combination of dynamic coupling matrix \(\Phi \) and absolute power flow matrix |P|), while Strategy II relies solely on matrix |P|. Spectral embeddings for these strategies (see Fig. 5a, b, respectively) differ a bit, but hierarchical clustering of the nodes of the spectral embedding results in the same three islands \(\{1,4,5\}, \{2,7,8\}\), and \(\{3,6,9\}\) outlined in Fig. 5a, b.

In the second step of the algorithm, MIQP (11) is solved for the 3-nodal aggregated network, and the resulting bipartition is equal to that calculated by HSC algorithm (see Fig. 2b). In the considered example, Strategy IV (based on HSC algorithm) and Strategy V (based on CSC algorithm) work as explained in Sect. 4 above, and the resulting islanding schemes are shown in Figs. 2b and 3b, respectively.

Fig. 5
figure 5

Spectral graph embedding and 3-cluster partitioning for strategies I and II

The two-dimensional spectral embedding for Strategy III based on matrix A is presented in Fig. 6a. The corresponding islanding scheme, post-islanding power flows, and islanding performance metrics are shown in Fig. 6b.

Fig. 6
figure 6

Steps of ISC algorithm: strategy III

Strategy VI starts with the K-partition of the network generated by HSC algorithm (see Fig. 2a) and sequentially bisects the biggest island using CSC algorithm until \(r_4K\) islands are obtained. The three resulting islands are encircled by dotted lines in Fig. 7a. In Strategy VII we meet partitions generated by HSC algorithm (the dotted line in Fig. 7b) and CSC algorithm (the dashed line in Fig. 7b) and obtain three islands: \(\{1,4\}\), \(\{2,5,3,8\}\), and \(\{3,6,9\}\). In the second step of the algorithm, MIQP (11) is solved for the corresponding aggregated 3-nodal networks, and the resulting partitions coincide with those shown in Fig. 2b both for Strategies VI and VII.

Finally, the islanding scheme with the minimum cost is selected from the three distinct schemes (those shown in Figs. 2b, 3b, and 6b) obtained by running two steps of ICS algorithm for Strategies I–VII.

Fig. 7
figure 7

ISC algorithm: strategies VI and VII

6 Performance evaluation

6.1 Experimental setup

Three test systems from MATPOWER 5.1 Simulation Package library [53] were used to evaluate performance of the proposed ISC algorithm (see Table 2). These relatively big systems were taken to verify computational efficiency of ISC algorithm and its applicability to bulky real-world grids.

Table 2 Test power systems

For each of three considered power systems 100 test cases were generated by switching off 5 random generators, breaking 5 random lines, and multiplying all demands by a random factor from 1 to 2. For every case consistent power flows were calculated by solving AC–OLS problem with opf routine of MATPOWER 5.1 [53] for dispatchable loads.Footnote 3

Fig. 8
figure 8

Examples of power flow variables’ variation for 100 cases in SMALL system

These power flows and corresponding generators’ angles were used to calculate stability indicators (power flow disruption \(D(\cdot )\) and dynamic coupling \(C(\cdot )\)) when an islanding operation is planned. Information on generator inertia constants was not available, so equal inertia constants were assumed when coefficients of dynamic coupling \(\phi _{ij}\) were calculated.

The sample line power flow and the sample real power output in SMALL grid are depicted in Fig. 8. High variance of these variables shows that considered collection of cases represents a wide range of power flow conditions in a grid. Such massive durability testing of OCI algorithms under the broad variety of grid and flow conditions is rarely performed in the existing literature. The only known exception is [27], where partitioning algorithms were tested under 2000 different power flow conditions in SMALL system.

We did not perform transient stability analysis to verify system stability after islanding, as the main focus of ISC algorithm was to improve the partition quality for existing island stability indicators and their combinations.Footnote 4 At the same time, although simplified metrics of load shedding (such as excess demand) were employed when an optimal islanding scheme was searched, for performance evaluation the AC–OLS model was applied to the system partitioned according to seven partitioning strategies of ISC algorithm.

Additional constraints were imposed when AC–OLS problem was solved for the islanded system: generators’ output was limited by the short-term ramp rate and previously shed load could not be restored in the process of islanding. So, controlled islanding always results in extra load shedding compared to the pre-islanding system state.

Dynamic coupling, power flow disruption, and load shedding had equal weights \(\alpha _C=\alpha _D=\alpha _S=1\) in the optimization criterion (7) of ISC algorithm. The requested island count \(K=4\) was selected for all three systems and granularity factors were set to \(r_1=\cdots =r_4=4\), i.e., 16 islands were demanded when calculating detailed partitions. The maximum island volume was \(W=\frac{3}{8}W(N)\). In particular, this means that any admissible partition with four connected islands has at least three big islands (those having the volume \(\ge \frac{W(N)}{4}\)).

6.2 Partitioning quality

ISC is an approximate algorithm, and there are several possible sources of its inaccuracy that should be inspected.

Firstly, there may be the discrepancy between real objectives of controlled islanding (i.e., preserving the transient stability and preventing a blackout with minimum load shedding) and their representation in the model (7) (a combination of computationally efficient metrics). Although being critical for the final efficiency of an islanding technique, analysis of model adequacy falls beyond the scope of the present article, which concentrates on the existing OCI metrics.

Another important aspect is the method used to evaluate load shedding. The estimate \(S_{AC}(\cdot )\) of the amount of load shedding calculated from AC–OLS model is assumed the most accurate one, but ISC algorithm employs its maximum-flow relaxation \(S_{MF}(\cdot )\) to reduce the problem to MIQP (11). Large discrepancy between these metrics may sufficiently distort the optimization criterion.

Figure 9 shows the relation between \(S_{MF}(s)\) and \(S_{AC}(s)\) for islands met in optimal partitions of SMALL and MEDIUM powers systems. It can be seen from the figure that \(S_{MF}(\cdot )\) correlates wellFootnote 5 with \(S_{AC}(s)\) but systematically underestimates the latter. Also, there are numerous situations when \(S_{MF}(s)=0\) while \(S_{AC}(s)>0\).

Accuracy of \(S_{AC}(\cdot )\) prediction can be improved sufficiently by considering a three-variate non-linear regression

$$\begin{aligned} \bar{S}_{AC}(s) = \max \left[ S_{MF}(s)-a GR_{MF}(s)+bw(s), 0\right] , \end{aligned}$$
(13)

where a and b are regression parameters (grid-dependent, in general), w(s) is the volume of island s, while \(GR_{MF}(s)\) is the generation reserve. The latter is calculated as \(GR_{MF}(s):=\sum _{i\in s} \left( G_i-g_i^*(s)\right) \), where \(G_i\) is the maximum real power output of generator bus \(i\in s\) and \(g_i^*(s)\) is its real power output in the solution of MF–OLS problem for island s.

Fig. 9
figure 9

Relation (scattering plot) between AC–OLS-based load shedding \(S_{AC}(s)\) (vertical axis) and maximum-flow-based load shedding \(S_{MF}(s)\) (horizontal axis). A diagonal line\(y=x\) is added for clarity

Fig. 10
figure 10

Scattering plot for \(S_{AC}(s)\) (vertical axis) and \(\bar{S}_{AC}(s)\) (horizontal axis). Circles denote points of the training set, while pluses go for the testing set

For comparison, scattering plots (predicted value \(\bar{S}_{AC}(s)\)vs exact value \(S_{AC}(\cdot )\)) of regression (13) under the best parameter values (\(a=0.43, b=0.02\) for SMALL system and \(a=0.97, b=0.04\) for MEDIUM system) are shown in Fig. 10a, b, respectively. Correlation is 0.92 with mean absolute error (MAE) 13.9 MW on the training set and is 0.94 with MAE 14.1 on the testing setFootnote 6 for SMALL system. For MEDIUM system correlation is equal to 0.99 with MAE 39.2 MW on the training set and is equal to 0.98 with MAE 39.5 MW on the testing set.Footnote 7

To take advantage of the more accurate regression (13) only minor change in MIQP setting (11) is needed. It is enough to introduce the new non-negative continuous variables \(\sigma _k\ge 0\), \(k=1,\ldots ,K\), which satisfy inequality constraints

$$\begin{aligned} \sigma _k\ge \sum _{i=1}^n x_{ik}\left[ l_i-a(G_i-g_i)+bw_i\right] ,k=1,\ldots ,K \end{aligned}$$

and define the new cost function

$$\begin{aligned} {{\mathrm{tr}}}X^T L(A)X + \sum _{k=1}^K \sigma _k. \end{aligned}$$
(14)

The revised MIQP still has linear constraints and the convex cost function. Therefore, its complexity does not increase, while its solution accurately predicts the optimal value of the cost function

$$\begin{aligned} F(\pi )=\alpha _C C(\pi )+\alpha _D D(\pi )+\alpha _S S_{AC}(\pi ) \end{aligned}$$

in the considered test systems and is expected to meet application-specific accuracy requirements.

The final aspect of algorithm accuracy is that MIQP (11) is solved only approximately by ISC algorithm due to high problem dimensionality, and the approximation error of ISC algorithm should be estimated by comparing the ISC solution to the exact solution of MIQP (11). Due to computational intractability of problem (11) for large grids, such a direct error evaluation is impossible for MEDIUM and LARGE system, while for SMALL system the mean relative error of ISC algorithm for 100 cases is just \(2\%\) (the maximum error is \(9\%\)). Therefore, approximation error is negligible.

Let us continue with algorithm benchmarking. The proposed ISC algorithm is based on CSC and HSC, the most efficient spectral-clustering-based algorithms for OCI. It is natural to use them as a benchmark in performance evaluation of ISC for large grids where exact solution is intractable. Unfortunately, K-partition \(\chi _{K}(|P|~|\mathbf {w})\) computed by HSC algorithm is often imbalanced, and so does K-partition \(\chi _{K}(\tilde{\Phi }|\mathbf {w})\), which we used to elicit coherent generator groups for CSC.

As a consequence, both CSC and HSC applied directly to our data in most cases return imbalanced partitions being highly impractical for controlled islanding. Figure 11 shows the volume of the largest island (relative to the total grid volume) for SMALL and MEDIUM test systems. The horizontal line shows the maximum volume constraint. HSC partitions are shown with circles, while crosses stand for CSC partitions. It follows from the figure that CSC returns a balanced partition only in 18 cases of 100 for SMALL system, while HSC results in a balanced partition in 12 cases. For MEDIUM system HSC returns a balanced condition just once, while CSC partitions are always imbalanced.

Fig. 11
figure 11

Relative volume (percent) of the largest island for K-partitions generated by CSC and HSC. Case numbers are depicted along the horizontal axis

Win rates of seven strategies of ISC algorithm for three test systems are presented in Table 3. The winning strategy minimizes the cost function \(F_{AC}(\cdot )\) with load shedding evaluated from AC–OLS problem. Strategies VI and II appear the most successful for all three test systems, while Strategies IV and V only occasionally win (These strategies represent the closest balanced analog of HSC and CSC algorithms consequently, so they can be, to some extent, used as a baseline to compare ISC to the competing algorithms.)

Table 3 Win rate (percent) of different strategies of ISC algorithms

The average partitioning quality of ISC algorithm for SMALL test system is depicted in Fig. 12a. The value of cost function is enclosed in a frame, all its components are also presented for all seven strategies and for the winning strategy. Although Strategy II (fixed granularity for matrix |P|) does not account directly for generator coherency, it suggests competitive coherency cost and wins in disruption and load shedding. At the same time, Strategy VI (sequential partitioning) has the same average cost due to a little bit lower generator coherency under a slightly higher disruption and load shedding. These two strategies remain the leaders for all three test systems.Footnote 8

A yet another conclusion is that considering excessive number of eigenvectors (which is the main idea of ISC algorithm) is essential to achieve good performance, since Strategy IV, which differs from Strategy II only in the number of eigenvectors considered, is inferior. The same is true for Strategy V, which is based on CSC and also operates with the limited number of eigenvectors. Performance metrics of ISC strategies for MEDIUM system are presented in Fig. 12b.

The similar analysis for LARGE system is hindered by the fact that for this system some strategies often suggest imbalanced partitions (Strategies I, V, and VII show the worst success rate: 24, 32, and 34%, respectively) even when 4 K eigenvectors are calculated. Since island volume constraints are mandatory, ISC algorithm neglects imbalanced partitions suggested by some strategies when selecting the final solution, and there is no common base for the comparison of these strategies.

Therefore, maintaining several different partitioning strategies is important to always obtain a good islanding solution in large power systems.

Fig. 12
figure 12

Average performance metrics for Strategies I–VII of ISC algorithm. Total cost \(F_{AC}(\cdot )\) is framed. Generator coherence \(C(\cdot )\), flow disruption \(D(\cdot )\), and load shedding \(S_{AC}(\cdot )\) are also shown

6.3 Flexibility

In many existing OCI algorithms [16, 32, 37, 39, 48] an optimization criterion cannot be tuned to assign different importance to different performance metrics of controlled islanding. Distinct to them, in ISC algorithm the weights of cost function components (dynamic coupling, disruption, and load shedding) can be chosen arbitrary leading to different resulting partitions. For example, in Fig. 13 optimal islanding schemes for SMALL system under several extremal settings are presented. In Fig. 13a weights in cost function (7) were chosen to minimize dynamic coupling \(C(\cdot )\) while ignoring \(D(\cdot )\) and \(S(\cdot )\). On the contrary, in Fig. 13b disruption \(D(\cdot )\) is minimized with no attention paid to \(C(\cdot )\) and \(S(\cdot )\). The partition that balances generators’ dynamic coupling \(C(\cdot )\) and disruption \(D(\cdot )\) is presented in Fig. 13d, and the one taking care only for the shed load \(S(\pi )\) is presented in Fig. 13e.

For comparison, the best alternative partition (calculated by HSC algorithm) and its performance metrics are presented in Fig. 13c. Therefore, every single performance metric of HSC partition or a combination of metrics can be improved by ISC algorithm, and the optimization criterion can be flexibly tuned to adjust relative importance of different metrics.

Fig. 13
figure 13

Optimal islanding schemes for different cost functions

6.4 Computational complexity

Compared to the competitive spectral partitioning algorithms (CSC and HSC), the proposed ISC algorithm requires additional calculations: in the first step several overdetailed spectral partitions are calculated (strategies I– VII are introduced in Sect. 4), and in the second step every detailed partition \(\pi ^\text {I}, \ldots , \pi ^\text {VII}\) is merged into a K-partition by CPLEX optimization routines.Footnote 9 Therefore, total computation time of ICS algorithm is equal to the maximum processing time for partitioning strategies I, ..., VII (although the first suggestion is given as soon as the fastest strategy completes).

The detailed statistics of MIQP solution time is shown in Fig. 14. Solution time is lower for minimum-granularity strategies III–V due to lower dimensionality of the problem. On the other hand, solution time depends insignificantly on the test system as the order of an aggregated graph does not depend on the size of an initial grid but only on the number of connected islands \(n'\) in the detailed partition, which linearly depends on the number K of islands requested (remember that \(K=4\) in all experiments).

Figure 15 presents the average MIQP computation time as a function of dimensionality of aggregated graph \(n'\) (SMALL system is used for this test). Computation time grows exponentially, therefore, for \(n'>K_{max}\approx 20\) network dimension should be reduced by the greedy algorithm before running MIQP.

Figure 16 illustrates average time complexity of the first step of ISC algorithm for all seven parallel strategies. Comparing to Fig. 14, we conclude that the second step of ISC algorithm is fast enough compared to the first step in MEDIUM and LARGE systems, and its computation time is significant only in SMALL system.

The second step of ISC algorithm can be fostered in two ways. The first issue is high variability of computation time (A typical problem of branch-and-bound procedures.) Sometimes the exact solution takes much longer than usually, and it may be critical for online OCI applications. This problem is solved by imposing a sharp time limit on MIQP calculations. Fortunately, in ISC algorithm the branch-and-bound procedure always starts with a feasible record solution produced by the greedy algorithm. The greedy algorithm is fast enough (less than 0.03 s on average) and often provides good solutions (the average cost improvement of the exact algorithm over the greedy one varies from 1.5% for LARGE system to 7% for SMALL system).

The second approach to foster MIQP calculations assumes increasing the number of processing units. Branch-and-bound procedures are easily parallelized, and doubling the number of processors almost doubles the performance.

Fig. 14
figure 14

Time complexity of MIQP (the second step of ISC algorithm) for 3 test systems and 7 partitioning strategies. The median time and quartile boundaries are presented

Fig. 15
figure 15

Time complexity of MIQP vs order of aggregated graph being partitioned (logarithmic scale)

Fig. 16
figure 16

Average time of eigenvectors’ calculation (the first step of ISC) for 7 parallel strategies of ISC algorithm. Eigenvectors were evaluated with eigs routine of Matlab R2014a run on Intel Core i-5 3337U CPU (1.8 GHz) laptop

Figure 16 shows that evaluation of eigenvectors of the normalized Laplace matrix is the most computationally expensive operation of spectral clustering algorithms (at least, for realistically sized grids). Its complexity depends on the number of eigenvectors requested, on matrix dimension, and on its sparsity ratio.

In Strategies I and III \(r_1K\) eigenvectors of matrix \(L_{sym}(A|\mathbf {w})=L_{sym}(\alpha _C\Phi +\alpha _D|P|~|\mathbf {w})\) are calculated. Strategies II and IV require calculation of \(r_2K\) eigenvectors of \(L_{sym}(|P|~|\mathbf {w})\). The latter matrix is more sparse and, therefore, the first step of Strategies I and III is computationally more expensive than that of Strategies II and IV.

Strategy V (CSC-based algorithm) is slightly more computationally expensive than Strategy IV (HSC-based) for SMALL and MEDIUM systems as it involves calculation of the eigenproblem for the Laplacian of denser matrix \(\tilde{\Phi }\) of generators’ dynamic coupling. At the same time, for LARGE system Strategy V is faster than Strategy IV due to the smaller size of matrix \(\tilde{\Phi }\).

Strategy VI involves the nested spectral bisection of K-partition and appears the most time-consuming strategy for all three test systems. On the other side, it is the most successful one (according to Table 3) and is very reliable in generating admissible (balanced) partitions. Comparing Figs. 16 and 12 we see that Strategy II is approximately three times faster than Strategy IV and results in only minor loss of partitioning quality. Therefore, one may consider abandoning Strategy VI if it fails to satisfy application-specific time limits.

Strategy VII takes partitions \(\pi _\text {IV}\) and \(\pi _\text {V}\) as an input, so its computation time is the maximum of those for Strategies IV and V (computation time of the meet operation can be neglected).

For SMALL system eigenvector calculation is very fast, and, hence, ISC algorithm is slower than CSC [37] and HSC [39] algorithms (mostly, due to the additional second step, which takes about 2 sec. irrespective of the system dimension). For MEDIUM system the speed is comparable (15 seconds is reported for HSC [39], while CSC algorithm was tested only for the reduced Great Britain 815-bus system in [39]). These algorithms cannot be directly compared for LARGE system, since CSC and HSC fail to provide eligibly balanced partitions. If we relax the maximum island volume constraint, we can expect ISC to be at least three times slower than CSC and HSC, mostly due to the slow Strategy VI (which, as noted above, can be abandoned in case of time deficit).

It should be noted that for LARGE system (and, perhaps, for MEDIUM system) neither of existing spectral-clustering-based algorithms is fast enough for online islanding, when run on a PC. Eigenvectors’ calculation takes most time, and, fortunately, it can be dramatically fostered by using parallel vector calculus capabilities of graphical processing units (GPU) [5, 31]. Hence, fast implementation of eigenvector calculation for spectral clustering OCI algorithms in large power systems is no more than a programming issue. Potentially, the performance improvement can also be obtained by replacing spectral clustering with some algorithms of NCut minimization based on semidefinite programming [3, 21]. A yet another promising approach is to use deep learning techniques to replace the optimization problem solution with fast calculation by a multilayered artificial neural network. In this case the algorithm proposed in this article calculates an extensive data set used to learn the neural network. Comparison of these approaches requires additional work.

7 Conclusion

An improved version of spectral-clustering-based OCI algorithms [37, 39] is proposed in this article, which accounts for generator coherency, power flow disruption, load shedding and island size, allowing to adjust flexibly relative importance of these performance metrics and caring for the maximum island volume.

An optimal islanding scheme is sought in two steps. Firstly, several alternative detailed grid partitions are determined that contain more islands than required while balancing generator coherency and power flow disruption. Secondly, some islands in these partitions are merged to optimize the weighted sum of generator coherency, power flow disruption, and the amount of load shedding while fulfilling the maximum island volume constraint. The second step reduces to MILP whose dimension does not depend on the dimension of the original grid and is small enough to use exact algorithms.

A series of experiments for three standard test grids was performed. The results show that, compared to competing algorithms, HSC [39] and CSC [37], the proposed improved spectral clustering algorithm results in substantial improvement of both the overall composite criterion of partition quality (more than twice for some grids) and of each its component. The algorithm is computationally efficient. Potentially, it can be used to partition large power systems in real time.

The proposed algorithm can also be helpful in applications other than energy systems to partition high-dimensional directed graphs with asymmetric matrix of vertex weights.

The multi-objective criterion studied is a weighted sum of various performance metrics (generators’ dynamic coupling, power flow disruption, ECI, excess demand, load shedding, and, probably, others), but the choice of performance metrics included into the optimization criterion and assignment of their relative weights is an open question in general. Further research is needed to establish an empirically grounded relation between transient stability of an island being created (e.g., with simulation techniques from [3, 40]) and computationally efficient performance metrics (some of which were mentioned above).