The 2003 Northeast blackout exposed the vulnerability of the US power system and made plain the urgent need for real-time state monitoring and control of the grid, leading to the development of Wide-Area Measurement Systems (WAMS) technology. Under the auspices of the US Department of Energy and the North American Synchrophasor Initiative, the development and deployment of high-resolution, GPS-synchronized PMUs have been greatly accelerated, together with the development of many new WAMS architectures and applications. However, as the number of PMUs scales into the thousands within the next few years, Independent System Operators (ISOs) and utility companies are struggling to understand how the resulting enormous volumes of real-time data can be efficiently harvested, processed, and utilized to solve wide-area monitoring and control problems for a realistic power system. Given the complexity and scale of next-generation power grids, an important lesson that researchers have learned is that the ability to design and run at-scale wide-area controllers is indispensable and yet extremely difficult to achieve. Currently, there are six main research challenges that need to be resolved before wide-area control can transition from a concept to a reality. These challenges can be listed as follows.

  1.

    Scalability: The first and foremost challenge in designing tractable wide-area controllers is scalability. Any realistic power system network consists of several hundred to several thousand buses, generators, and loads that are spatially distributed over wide geographical spans. Developing tractable methods for modeling, simulation, and control of such large complex networks, and implementing those designs through affordable communication, continues to be a challenge for power system engineers. Foundational work on taxonomy theory for modeling and analysis of extreme-scale power system models exists [1], but its translation to simulation models is still missing. Languages such as Modelica and Hydra, for example, need to be exploited for modeling scalability through modularity, composition, static correctness, implicit representations, and structural dynamics. These abstractions then need to be brought under the umbrella of a common modeling language and compiler front end, followed by a library and language-level abstractions that support the needs of experimentation. Similarly, from a design standpoint, conventional state-feedback and output-feedback controllers such as Linear Quadratic Regulator (LQR) and Linear Quadratic Gaussian (LQG) control involve large matrix decompositions that can introduce detrimental numerical inaccuracies without any guarantee of robustness. They also require every node in the network to share its state information with every other node, resulting in an impractically large number of communication links. Traditionally, control theorists have addressed the problem of controlling large-dimensional systems by imposing structure on controllers. This line of work started with the idea of decentralized control [2], followed by techniques such as singular perturbation theory [3], balanced truncation [4], and gap reduction methods, among others.
These methods aim to simplify controller design for large systems by exploiting weak coupling between their state variables and by ignoring states that are “less important” than others [5, 6]. The trade-off, however, is that the resulting controllers are often agnostic of the natural coupling between the states, especially the coupling between the closed-loop states, since many of these couplings were forcibly eliminated to facilitate the design itself. Extending these methods to controller design for networks is therefore quite difficult, especially for power networks whose states may be defined over highly structured topologies such as spatial clusters of generators and loads [7]. A significant literature exists on the controllability and observability properties of power networks, but the literature on developing tangible yet simple low-dimensional controllers that satisfy global stability and dynamic performance requirements is still, unfortunately, very sparse. Ideas on aggregate control [8], glocal control [9], and hierarchical control [10] have recently been proposed to address this challenge. The goal of these designs, however, is to guarantee global closed-loop stability by modular tuning of local controller gains; their degrees of freedom for guaranteeing a desired closed-loop performance can be limited. Some recent papers such as [11] have used structural projection-based ideas for model reduction of large networks, but not for control design. Attention has also been drawn to designing controllers for large systems by finding low-rank solutions of algebraic Riccati equations [12]. However, like most Krylov subspace-based reduction methods, these controllers are unstructured, and hence demand as many communication links as the full-order LQR itself.
Distributed controllers using model matching [13], sparsity-promoting LQR [6], and structured LQR [14, 15] promise to reduce the communication density, but their designs inherit the same dimensionality as the full-order design. What designers lack is a tractable approach for constructing controllers that facilitates both design and implementation, preferably at the scale of several thousand buses. A promising rationale behind such an approach can be, for example, to exploit the clustered structure of the controllability Gramian of the closed-loop system under LQR state feedback, as shown recently in [16]. Clustering of generators and other assets opens up further opportunities for combining control theory with machine learning and computer science, and can therefore be a very attractive route to designing large-scale wide-area controllers. Yet another important challenge is to ensure robustness of these controllers, so that the network can still function gracefully if, for example, a generator equipped with an important wide-area PSS fails, or the power system model changes drastically between two consecutive disturbance events, thereby invalidating a fixed control design.
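The tension between a dense LQR gain and a structured one can be illustrated with a minimal sketch. The network below is a hypothetical ten-node symmetric system with weak Laplacian-like coupling (not any specific power grid model); the full LQR gain implies all-to-all communication, and zeroing its small entries emulates removing communication links. With the chosen threshold, the truncated closed loop can be shown to stay stable by a symmetric-perturbation argument, but this post hoc truncation is exactly the crude alternative to the structured designs discussed above.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

rng = np.random.default_rng(0)
n = 10
# toy network: self-damping plus weak symmetric (Laplacian-like) coupling
W = rng.uniform(0.0, 0.05, (n, n))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)
A = -0.5 * np.eye(n) - (np.diag(W.sum(axis=1)) - W)  # Hurwitz by construction
B = np.eye(n)                                        # every node actuated
P = solve_continuous_are(A, B, np.eye(n), np.eye(n))
K = B.T @ P        # dense LQR gain: implies all-to-all communication links
# emulate limited communication: drop gains below a small threshold
K_sp = np.where(np.abs(K) > 0.02, K, 0.0)
links_dense = np.count_nonzero(K)
links_sparse = np.count_nonzero(K_sp)
stable = np.max(np.linalg.eigvals(A - B @ K_sp).real) < 0
print(links_dense, links_sparse, stable)
```

The threshold here is ad hoc; sparsity-promoting LQR [6] instead penalizes nonzero gains inside the optimization itself, so that stability and performance are accounted for during, rather than after, sparsification.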

  2.

    System Identification and Model Validation: Given the large size and extraordinary complexity of any realistic power system, deriving and simulating the dynamic model of an entire network is extremely challenging. Constructing approximate, aggregated, reduced-order models using simplifying assumptions therefore becomes almost imperative in practice. Papers such as [17] have defined aggregation methods for simulation time, for generation units, and for load demand units. The performance of the aggregated models is checked against detailed models, including binary effects such as minimum down-time, minimum generation, or demand-side contracts. This is especially important for control designs such as model predictive control, where very large optimization problems need to be solved online. The optimization in these cases usually has to be simplified by using approximate power plant models, aggregating several assets into single units, and limiting controller foresight. The main question is: if aggregation is necessary, then what degree of aggregation is acceptable in the asset domain as well as in the time domain?

    Beyond reducing the complexity of simulations, model aggregation may also be necessary if one is purposely interested in simulating only a certain part of the grid, or a certain phenomenon that occurs only over a certain timescale. For example, one may be interested in simulating only the low-frequency inter-area oscillations of a group of synchronous generators instead of the entire spectrum of their frequency response. Identifying coherent subgroups among these generators, aggregating each subgroup into an equivalent hypothetical generator, and analyzing the oscillation patterns of these equivalents become necessary in that situation. For example, we often hear power system operators mention how “Northern Washington” oscillates against “Southern California” in response to various disturbance events. The main question here is whether we can analytically construct dynamic electromechanical models for these conceptual, aggregated generators representing Washington and California, which in reality are hypothetical combinations of thousands of actual generators. One motivating example is the Pacific AC Intertie system on the US west coast, for which a five-machine dynamic equivalent mass–spring–damper model has been widely used in the literature [18]. The main question, again, is: how can we construct an explicit dynamic model for this conceptual equivalent, preferably in real time, using voltage, current, or power flow measurements, in order to establish a prototype for the nonlinear inter-area dynamics of the entire interconnection?

    Recently, several papers such as [19, 20] have addressed this problem and derived a series of results on model reduction based on Synchrophasor measurements, combining aggregation theory with system identification. Several open questions still exist, however. For example, once a baseline model is constructed, one must study how it can be updated at regular intervals using newer PMU data; ideas from adaptive learning and decomposition theory [21] can be useful here. How the updated model can be used to predict the slow frequencies and corresponding damping factors also needs to be formalized and validated via realistic simulations. Questions also exist on how the reduced-order model can predict the sensitivity of the power flow oscillations inside any area with respect to faults in any other area. If these questions are answered, utilities can exploit this information from simulations of the aggregated model and evaluate their dynamic coupling with neighboring companies, leading to more efficient resource planning. A significant amount of work still needs to be done in formalizing how different failure scenarios in the actual full-order grid model can be translated to the aggregated model, what kind of advanced signal processing and filtering need to be applied to PMU data for accurate identification of the aggregated model parameters, and how controllers designed on the basis of the aggregated model can be mapped back to the original system for implementation.

    An equally significant challenge is the validation of identified models, whether full-scale or approximate. This is particularly true for the dynamic models of the generators and their associated controls. After the large blackout on the US west coast in 1996, it became clear that the generator models used in the studies were not accurate enough. Since then, the Western Electricity Coordinating Council (WECC) standard has required updating of model parameters through standardized testing. Power systems in other parts of the world where stability is an issue have followed a similar approach. However, off-line testing is expensive, and as PMUs have proliferated over the past decade, using PMU data recorded during disturbances to update model parameters online has become more common. Several challenges still stand in the way. For example, new technologies such as renewable generation sources and storage that require power-electronic grid interfaces introduce completely new types of dynamic models. For these, together with HVDC and FACTS devices, the development of accurate modeling methods and of procedures to validate the models and their parameters is currently lagging behind.
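The coherency-based aggregation idea above can be sketched in a few lines. The example assumes a hypothetical six-generator network with two tightly coupled areas joined by one weak tie line (invented numbers, not the Pacific AC Intertie): in slow-coherency theory, the sign pattern of the slowest oscillatory mode of the coupling Laplacian partitions the generators into coherent groups.

```python
import numpy as np

# toy topology: generators {0,1,2} and {3,4,5} strongly coupled internally,
# joined by one weak tie line (2 -- 3); values are illustrative, not from data
strong, weak = 10.0, 0.4
W = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    W[i, j] = W[j, i] = strong
W[2, 3] = W[3, 2] = weak
L = np.diag(W.sum(axis=1)) - W       # Laplacian of the linearized swing coupling

# slow coherency: eigenvector of the smallest nonzero Laplacian eigenvalue
vals, vecs = np.linalg.eigh(L)       # eigenvalues in ascending order
interarea = vecs[:, 1]               # column 0 is the uniform (rigid-body) mode
groups = (interarea > 0).astype(int) # same sign pattern => coherent group
print(groups)
```

Aggregating each coherent group into a single equivalent machine then yields the kind of reduced inter-area model discussed above; recovering the same grouping from measured PMU data, rather than from a known Laplacian, is what papers such as [19, 20] address.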

  3.

    Wide-area communication: Another major roadblock to implementing wide-area control in a practical grid is that the current power grid IT infrastructure is rigid and of low capacity, as it is mostly based on closed, mission-specific architectures. The current push to adopt existing TCP/IP-based open Internet and high-performance computing technologies such as the NASPInet [22] would not be enough to meet the requirement of collecting and processing the very large volumes of real-time data produced by thousands of PMUs. Moreover, the impact of an unreliable and insecure communication and computation infrastructure, especially long delays and packet-loss uncertainties over wide-area networks, on the development of new WAMS applications is not well understood. Accurate delay models and network synchronization rules are absolutely critical for wide-area control, since the timescale of the physical control loop is on the order of tens of seconds to a few minutes, while the spatial scale can range over thousands of miles, for example, the entire west coast of the US [23]. The existing PMU standards, IEEE C37.118 and IEC 61850, only specify the sensory data format and communication requirements; they do not indicate any dynamic performance standard for the closed-loop system. One needs to develop a cyber-physical framework in which one can explicitly show how the closed-loop dynamic responses of phase angles, frequencies, voltages, and current phasors at any part of a grid model are correlated to real (not simulated) network delays that arise from transport, routing, and, most importantly, scheduling, as other applications run in the shared network. Several researchers have looked into delay mitigation in wide-area control loops, with controllers designed for redundancy and delay insensitivity [24,25,26].
All of these designs are, however, based on worst-case delays, which makes the controller unnecessarily restrictive and may degrade closed-loop performance. Instead, what one really needs is to understand which queuing protocols are most likely to be used for transmitting PMU data over a shared wide-area communication network, how prior knowledge of these protocols can help in estimating variable queuing delays, and how that knowledge can subsequently be used to design delay-aware wide-area controllers instead of following the traditional approach of delay tolerance. Ideas from real-time calculus and arbitrated networked control systems, both of which have recently been shown to be highly promising tools for this purpose in embedded system design, can be used for this analysis [27]. The goal here is twofold: first, to characterize the closed-loop response of a large power grid in terms of distinct performance metrics, and second, to derive analytical expressions for the error bounds between ideal designs and delay-aware designs as explicit functions of the queuing protocols. Besides delays, other communication challenges such as packet drops, bad data detection, synchronization issues, and problems arising from quantization of PMU data also need to be addressed.
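Why worst-case delay design is restrictive can be seen in a deliberately simple numerical sketch (a scalar feedback loop with assumed delay values, not real PMU network delays): for the classical delayed loop x'(t) = -k x(t - tau), continuous-time theory gives stability exactly when k*tau < pi/2, so a gain tuned for a typical delay can go unstable at the worst-case delay, and a design guaranteed for the worst case must detune the gain even when large delays are rare.

```python
import numpy as np

def settles(k, tau, dt=0.01, T=30.0):
    """Euler-simulate x'(t) = -k * x(t - tau); True if |x| has decayed."""
    d = int(round(tau / dt))
    n = int(round(T / dt))
    x = np.ones(n + 1)
    for t in range(n):
        delayed = x[t - d] if t >= d else 1.0   # constant pre-history x = 1
        x[t + 1] = x[t] - dt * k * delayed
    # check the envelope over the last 5 s rather than one sample
    return np.max(np.abs(x[-int(5.0 / dt):])) < 1e-2

k = 1.0
# theory for this loop: stable iff k * tau < pi / 2 (about 1.57 s here)
typical_ok = settles(k, tau=1.0)   # typical delay: stable
worst_ok = settles(k, tau=2.0)     # worst-case delay: unstable
print(typical_ok, worst_ok)
```

A delay-aware design would schedule or estimate the actual queuing delay and keep the aggressive gain most of the time; a delay-tolerant design must assume tau = 2.0 always and lower k accordingly.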

  4.

    Cost allocation: Another challenge is the economics behind wide-area control. Installing a wide-area communication and control infrastructure would require significant monetary investment from the ISOs and utilities. Currently, there are no incentives or markets for wide-area control. Hence, it is not clear how these companies can jointly decide on the use and deployment of communication links to achieve global control objectives in the most economical way. For example, if different generators within the balancing regions of different companies benefit differently in terms of oscillation damping, transient or voltage stability margins, etc., then how much cost benefit does a company gain by moving from selfish, purely local control to system-level wide-area control? Ideas from cooperative game theory and distributed real-time control need to be combined to develop efficient and robust cost-sharing mechanisms before the controllers can be implemented in reality [28]. The sensitivity of the cost allocation to controllers with and without network delays also needs to be tested. In fact, the final goal can even be to create a power system market for wide-area control, where pricing and incentives are decided not only by steady-state power flows but also by their dynamics and transient oscillations, all of which cause wear and tear in the excitation systems inside the generators.
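The cooperative-game idea can be made concrete with a small sketch: three hypothetical utilities share one wide-area communication backbone, and the Shapley value splits the joint cost by averaging each utility's marginal cost over all joining orders. All cost numbers and the cost function are invented for illustration.

```python
from itertools import permutations

BACKBONE = 90.0                              # shared backbone cost (assumed)
CONNECT = {"A": 30.0, "B": 20.0, "C": 10.0}  # per-utility tie-in cost (assumed)

def cost(coalition):
    """Total cost of a coalition: backbone (paid once) plus members' tie-ins."""
    if not coalition:
        return 0.0
    return BACKBONE + sum(CONNECT[u] for u in coalition)

players = sorted(CONNECT)
shapley = {p: 0.0 for p in players}
orders = list(permutations(players))
for order in orders:
    joined = set()
    for p in order:
        shapley[p] += cost(joined | {p}) - cost(joined)  # marginal cost of p
        joined.add(p)
shapley = {p: v / len(orders) for p, v in shapley.items()}
print(shapley)  # each utility pays its own tie-in plus an equal backbone share
```

For this additively separable toy game the outcome is obvious (an equal backbone split), but the identical machinery applies when coalition values reflect nonseparable benefits such as shared damping improvements, which is where the game-theoretic formulation in [28] becomes necessary.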

  5.

    Cybersecurity: The next challenge is resilience, privacy, and cybersecurity. The main question here is: how can we ensure privacy of PMU data, and security and resilience of wide-area computing and communication architectures, against nefarious attacks and failures at both the cyber and physical layers? Existing networking solutions need to be used to evaluate distributed server-based and peer-based architectures and their potential use in securing wide-area control [29]. Attention must be paid to all three layers of resilience, i.e., detection, localization, and mitigation of attacks [30]. With several thousand networked PMUs scheduled to be installed in the United States by 2020, the exchange of Synchrophasor data between balancing authorities for any type of wide-area control will involve several thousand terabytes of real-time data flow per event, thereby opening up a wide spectrum of opportunities for adversaries to launch data manipulation attacks, denial-of-service attacks, GPS spoofing, attacks on transmission assets, and so on. The challenge is further aggravated by the gradual transition of WAMS from centralized to distributed architectures to speed up data processing. Several recent papers have studied how false data may be deceptively injected into a power grid through its state estimation loops. Others have proposed estimation-based mitigation strategies to secure the grid against some of these attacks. The fundamental approach behind many of these designs is the idea of Byzantine consensus, a fairly popular topic in distributed computing, where the goal is to drive an optimization or optimal control problem to a near-optimal solution despite the presence of a malicious agent.
In practice, however, this approach is not acceptable to most WAMS operators: they are far more interested in identifying a malicious agent if one exists in the system, disconnecting it from the estimation or control loop, and continuing operation with the remaining nonmalicious agents, rather than settling for a solution that keeps the attacker unidentified in the loop. This basic question of how to catch malicious agents in distributed wide-area monitoring applications is still an open challenge in the WAMS literature. Ideas on differential privacy are also currently being researched to ensure privacy of PMU measurements, so that sensitive information about system parameters, line flows, and load consumption that may be embedded in these measurements cannot be deciphered accurately by malicious users [31].
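The identify-and-disconnect philosophy can be sketched with a toy DC state-estimation example (the redundant measurement matrix, noise level, and attack are invented for illustration): a chi-square test on the least-squares residual detects that some measurement is bad, and the largest residual then points at the offending meter, so it can be removed from the loop rather than tolerated.

```python
import numpy as np

rng = np.random.default_rng(1)
n_state = 4
# toy redundancy: each state variable is metered five times (assumed layout)
H = np.kron(np.ones((5, 1)), np.eye(n_state))   # 20 measurements, 4 states
n_meas = H.shape[0]
sigma = 0.01
x_true = rng.normal(size=n_state)
z = H @ x_true + rng.normal(scale=sigma, size=n_meas)
z[5] += 0.5                  # false-data injection on meter 5 (hypothetical)

# least-squares state estimate and measurement residual
x_hat, *_ = np.linalg.lstsq(H, z, rcond=None)
r = z - H @ x_hat
J = float(r @ r) / sigma**2  # chi-square statistic, dof = n_meas - n_state
detected = J > 3 * (n_meas - n_state)   # loose detection threshold (assumed)
suspect = int(np.argmax(np.abs(r)))     # largest residual flags the meter
print(detected, suspect)
```

Stealthy attacks constructed in the column space of H evade this residual test entirely, which is precisely why the literature cited above treats detection, localization, and mitigation as distinct problems.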

  6.

    Simulation testbeds: The final challenge is to create a reliable simulation testbed that can be used for verification and validation of cyber- and physical-level experiments on wide-area control of very large-dimensional power system models. In the current state of the art, using PMU data for research purposes is contingent on obtaining real data from the specific utility companies that own the PMUs at the locations of interest. Gaining access to such data is not always easy due to privacy and nondisclosure issues. More importantly, in many circumstances, even if real PMU data are obtained, they may not be sufficient for studying the detailed operation of the entire system because of their limited coverage. WAMS researchers are in serious need of a Hardware-in-the-Loop (HIL) simulation framework in which high-fidelity, detailed models of large power systems can be simulated. These simulations, for example, can be run using Real-Time Digital Simulators (RTDS) and Opal-RT, with the dynamic responses captured via real hardware PMUs from different vendors, synchronized to a common GPS reference. These physical-layer testbeds also need to be federated with metro-scale, multilayered dynamic optical network testbeds, an example being the Breakable Experimental Network (BEN) [32], owned by the GENI project of the US National Science Foundation. The resulting testbed infrastructure would not only be relevant for WAMS, but could also complement other emerging and well-established networking testbeds around the country for different cyber-physical applications, transportation, the Internet of Things, smart manufacturing, and cybersecurity. Efforts must also be made to make these testbeds as widely available to the power system research community as possible, so that researchers from other institutions, both nationally and internationally, can use them to carry out their own experiments via remote connection.
Appropriate measures of privacy, security, and safety must also be imposed on such remotely accessible testbeds to ensure smooth and safe usage by multiple parallel users.