1 Introduction

When reliable systems or operations are envisioned, there is an expectation that the entity being considered will produce consistent results, preferably meeting or exceeding some standard of operation, within some uncertainty. Often, reliability also has a temporal component associated with it.

Providing consistent, reliable service to end users is becoming a challenge as demand and end users’ expectations for energy increase. At times, this has resulted in power grids approaching their limits, and in the severest cases blackouts have occurred in parts of the grid. Figure 1 shows the frequency of transmission outages based on data from the NERC Disturbance Analysis Working Group (DAWG). The figure shows approximately 24 outages per year in the US with curtailments in the 100–1,000 MW range, about 5 outages per year in the 1,000–10,000 MW range, and one outage every 4 years at 10,000+ MW [1]. Large-scale outages are not unique to one country or a specific region of the world [2]. They can be triggered by mechanical failures in the power grid networks, by external forces such as natural calamities (earthquakes, hurricanes, etc.), and more recently by the threat of human-induced damage (e.g., cyber attacks).

Fig. 1 Blackout frequencies, 1984–2005
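The outage rates quoted above can be turned into rough annual probabilities. The following is only an illustrative sketch: it assumes that outages in each size class arrive as an independent Poisson process, which is a modeling assumption not made in the source, and the names `ANNUAL_RATES` and `prob_at_least_one` are hypothetical.

```python
import math

# Approximate outage rates per year, taken from the discussion of Fig. 1 [1].
ANNUAL_RATES = {
    "100-1,000 MW": 24.0,
    "1,000-10,000 MW": 5.0,
    "10,000+ MW": 0.25,  # one outage every 4 years
}


def prob_at_least_one(rate_per_year: float, years: float = 1.0) -> float:
    """Probability of at least one outage in `years`, under an ASSUMED Poisson model."""
    return 1.0 - math.exp(-rate_per_year * years)


if __name__ == "__main__":
    for size_class, rate in ANNUAL_RATES.items():
        print(f"{size_class}: P(at least one per year) = {prob_at_least_one(rate):.3f}")
```

Under this assumption, a rate of one 10,000+ MW outage every 4 years corresponds to roughly a 22% chance of at least one such outage in any given year.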

The electric grid is designed and operated to withstand any single (and often double) contingency by means of its protection and control systems. A common phrase in postmortem event analysis is therefore “Relays cannot start a disturbance”. However, this does not tell the whole story. A relay misoperation during a contingency (i.e., a fault) can significantly affect the grid’s operation and reliability. Incorrect operation covers both a failure to detect a fault and an unplanned action. In most widespread disturbances, there is usually a misoperation that aggravates the event.

In the last few years there have been several events in North America and abroad that might have been contained had it not been for unanticipated protection system performance. Often, the true test of a properly functioning relaying system comes during contingencies or faults. Protection systems (protective relays and associated relay systems) are expected to perform reliably during a grid disturbance. This expectation places a priority on maintaining the highest degree of reliability in the protection systems to ensure safe and reliable operation of the grid.

2 The Reliable Grid Challenge

An understanding of the complexities of the interconnected power grid, proper planning, good maintenance, and sound operating practices are key components of an effective grid reliability strategy. Formulating reliable power system strategies begins with accurate modeling and system analysis of strengths, weaknesses, limitations, expectations, and their interactions. Today’s grid requires a multi-scaled system approach to define technology-based solutions for reliable operation. These scales include:

  • The big picture: Interconnected grid system with defined boundaries, and forward-looking solutions for the overall grid and telecommunication infrastructures.

  • The providers of power: Individual power producers and power companies and associated support over the life cycle.

  • The component level: Individual elements affecting the system such as generation siting, substation capacity, or the end user.

  • Move toward standardization and use of open solutions to support harmonization, transparency, and interoperability.

The result of such an approach is the ability to provide explicit, normative explanations, and working definitions for common information being used amongst different groups in operation, planning, and protection.

Evolutions in protection and restoration principles for the smart grid are being made possible by wide-area measurement systems (e.g., PMUs). Real-time adjustment of the protection system’s security-dependability is within reach given the advancements in technology and investments in communication system infrastructures. Further improvements with standardization, processing relay settings, event recordings, and distributed data sources can also be achieved through ontology or knowledge-based semantics.

Resource and transmission adequacy are necessary components of a reliable and economic supply. Though the reliability and market economics are driven by different policies and incentives, they cannot be separated when the objective is reliability and availability. Today, grid planning faces an extremely difficult task given the challenge to achieve resource adequacy in today’s restructured industry, where market economics and local concerns often drive many of the decisions.

It is important to take a fresh and balanced approach to viewing the system as a whole by implementing various planning, operations, maintenance, and regulatory measures and weighing the costs, performance impact and risks associated with each measure.

This chapter provides a vision and roadmap for creating actionable intelligence for reliable, real-time grid operation by capturing the knowledge and experience of power system operation and control personnel and merging this knowledge with real-time data from disparate sources. Elements of this approach include the formation of a multi-disciplinary team and knowledge discovery methods that draw on electrical and computer engineering and on industrial and systems expertise through industry collaborators. Many of the building blocks of this approach have already been established, and results have been demonstrated in the literature. See [3] and other references in support of the roadmaps and building blocks. It is anticipated that this new perspective will yield results that improve system reliability and overall system performance.

3 A Technology-based Solution

Early energy management systems, such as Supervisory Control and Data Acquisition (SCADA), were specially developed electronic devices running specialized operating systems. They provided only fundamental functions such as real-time data collection and processing, with almost no real-world applications. In addition, proprietary databases and interfaces prevented third parties from designing innovative Information Technology (IT) solutions to overcome data exchange problems. Several applications, in areas such as Customer Relationship Management (CRM), Geographic Information Systems (GIS), SCADA/ERM (Extended Runtime Modules), Distribution Management Systems (DMS), Outage Management Systems (OMS), and Customer Information Systems (CIS), need to integrate and exchange data and information seamlessly (i.e., interoperability).

Figure 2 [4] shows the timing of events in the electric power grid and the reaction times commonly available for either an automated or operator intervention. At one end of the spectrum are the actions taken by the power system protection equipment to take an automated response based on measured quantities (e.g., frequency, current). At the other end of the spectrum, the operator controls the system in a steady-state mode using data acquired from a host of sensors via a SCADA system. Actions may be automated or are more often made based upon an operator’s visual interpretation of the data presented through a variety of meters and display devices.

Fig. 2 Critical timing and reaction times for power grid operations

In steady-state operations, an operator normally has adequate time to consider the data, consult text-based help guides, or seek another operator’s opinion before having to make a decision. Between these two ends of the spectrum is a time in which operators may have to make decisions based simply on heuristics or past experiences. Obviously, these actions may not result in the best outcome for reliability. This is the critical time period in which immediate actions must be made by system operators to prevent wide area collapses of the grid.

To ensure secure and stable operation of the power system across the temporal spectrum, new decision support tools must be developed and applied that provide actionable intelligence in the required timeframe. For secure and stable control, this points to an “automatic pilot” power system concept, which represents a trend in improving Energy Management Systems (EMS). Reliable real-time operation in auto-pilot requires many tools and services configured to work cohesively, including the Operator Training System (OTS), Dynamic Security Analysis (DSA), Optimal Power Flow (OPF), short-circuit analysis, etc. These tools need to operate both harmoniously and independently, organize their analysis results effectively, and share information among the various layers of the information service.

In other areas, such as emergency control and restoration control, multiple services must be harmoniously interconnected into a multi-agent system that performs calculations and analyses and creates actionable intelligence to support auto-pilot operation.

3.1 Novel Applications and Analysis

Adaptive protection and controls have been studied, and advanced applications have been implemented in many modern devices [5]. Advanced concepts that adjust the balance between dependability and security as applicable, through fuzzy logic applied to a set of real-time factors such as system state index, nodal price, and equipment outage impact index, have also been explored [6]. Today’s technologies, such as real-time simulation tools, allow these concepts to move beyond the theoretical and demonstration stages to “proof of concept”.

Wide-area monitoring, protection, and control (WAMPAC) systems are emerging as a cost-effective solution to improve system planning, operation and maintenance. WAMPAC systems can take advantage of the latest advances in sensing, communication, computing, visualization, and algorithmic techniques and technologies.

Synchronized phasor measurement technology, based on phasor measurement units (PMUs), and its applications are an important element and enabler of WAMPAC and have been receiving considerable attention from the power industry. With precise time synchronization, measurements from different locations in the system can be collected and compared in real time. The technology is hence well suited to monitoring and controlling the dynamic performance of a power system [7].
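As a minimal sketch of what comparing time-synchronized measurements can look like, the code below pairs samples from two locations by their common time tag and computes the voltage angle difference between them. The data layout (a dictionary mapping a time tag to a complex phasor) and the function names are assumptions made for illustration; they do not represent any particular PMU data format or vendor interface.

```python
import cmath
import math


def align_by_timestamp(stream_a: dict, stream_b: dict) -> list:
    """Pair phasors from two locations that carry the same time tag."""
    common_tags = sorted(stream_a.keys() & stream_b.keys())
    return [(t, stream_a[t], stream_b[t]) for t in common_tags]


def angle_difference_deg(phasor_a: complex, phasor_b: complex) -> float:
    """Angle of phasor_a relative to phasor_b in degrees, wrapped to -180..180."""
    # Multiplying by the conjugate subtracts the angles and handles wrap-around.
    return math.degrees(cmath.phase(phasor_a * phasor_b.conjugate()))


if __name__ == "__main__":
    # Hypothetical per-unit voltage phasors at two buses, keyed by time tag (s).
    bus_a = {0.00: cmath.rect(1.01, math.radians(12.0)),
             0.02: cmath.rect(1.01, math.radians(13.5))}
    bus_b = {0.00: cmath.rect(0.99, math.radians(-3.0)),
             0.02: cmath.rect(0.98, math.radians(-4.0)),
             0.04: cmath.rect(0.98, math.radians(-4.5))}
    for tag, va, vb in align_by_timestamp(bus_a, bus_b):
        print(f"t={tag:.2f} s  angle difference = {angle_difference_deg(va, vb):.2f} deg")
```

Only samples with matching time tags are compared, which is the property that precise time synchronization makes possible; unsynchronized recordings would not support this direct comparison.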

Implementation of this functionality gained significant attention after the 14 August 2003, Northeast Blackout event. One of the major challenges faced during the investigation was the rather limited availability of time-synchronized event recordings in the affected regions. As a result, one of the North American Electric Reliability Corporation (NERC) recommendations states: “Facilities owners shall, in accordance with regional criteria, upgrade existing dynamic recorders to include GPS time synchronization and, as necessary, install additional dynamic recorders” [8].

Applications of phasor measurement units (PMUs) range from visualization, postmortem analysis, state estimation improvements, congestion management, and controlled system islanding to angular and voltage stability alarming and automated control, adaptive protection and relay settings, intelligent load-shedding, and system restoration. Applying PMUs together with novel applications and analysis methodologies could help improve system reliability by mitigating undesirable device responses due to hidden failures and by monitoring system changes that affect settings, thus providing early warnings [9, 10]. Additional system developments to improve reliability include:

  • Novel algorithms for disturbance monitoring devices that calculate the proximity to voltage instability. These smart devices are installed at designated locations such as major load centers. They use local measurements to estimate the voltage stability margin and send alarm signals to a control center when they detect a local weak condition.

  • Installation of PMUs over the critical transmission corridors to monitor reactive power transfer to the load center.

  • Transmitting computation results from different geographical locations via a communication network to a control center. This will enable on-line wide-area voltage stability monitoring and control.

  • While microprocessor relays have gained full industry acceptance, a large number of legacy non-intelligent relays are still in operation, resulting in decreased reliability due to increased maintenance and failure costs or concerns that those costs will increase. The share of electromechanical and solid-state protection devices is estimated at about 70% in the US. It is not trivial to determine the end of life and probability of failure for electromechanical relays. In addition, the skill set and knowledge needed for troubleshooting, testing, and repairing the old technology are diminishing across the power industry (see Sect. 3.4).
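The first item in the list above, estimating a voltage stability margin from local measurements, can be illustrated with a simple impedance-matching index. The source does not specify which algorithm such devices use, so the sketch below is an assumption: it models the rest of the system as a Thevenin equivalent (a constant source behind a constant impedance), estimates that impedance from two successive local voltage and current phasors, and compares it with the apparent load impedance. The function names and the alarm threshold are hypothetical.

```python
def thevenin_impedance(v1: complex, i1: complex, v2: complex, i2: complex) -> complex:
    """Estimate the Thevenin impedance seen from a load bus.

    Assumes the source voltage E and impedance Z are unchanged between the
    two measurements, so E = v1 + Z*i1 = v2 + Z*i2 and Z = (v1 - v2)/(i2 - i1).
    """
    delta_i = i2 - i1
    if abs(delta_i) < 1e-9:
        raise ValueError("measurements too similar to estimate the impedance")
    return (v1 - v2) / delta_i


def stability_margin(v: complex, i: complex, z_thevenin: complex) -> float:
    """Return 1 - |Z_th|/|Z_load|; the index approaches 0 at maximum power transfer."""
    z_load = v / i
    return 1.0 - abs(z_thevenin) / abs(z_load)


def weak_condition(margin: float, threshold: float = 0.2) -> bool:
    """Flag a local weak condition; the threshold value is purely illustrative."""
    return margin < threshold
```

A device running logic of this kind would send an alarm to the control center when `weak_condition` returns true. In practice the estimate is sensitive to measurement noise and to changes on the system side, so a fielded algorithm would need filtering and validation that this sketch omits.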

There are methods to assess probability of failure and replacement needs using criteria such as: age, maintenance practices and records, industry experience with certain relay types, and criticality of failure to name a few.

However, it is not easy to justify relay upgrades based on age alone. Reluctance to upgrade to microprocessor relays is reinforced by the complexity associated with increased functionality (e.g., settings), the need for firmware upgrades, the short life span of computer technology, and the overall need to change the protection system philosophy and design. In addition, while digital relays provide a wealth of data, users may be faced with data overload. It is often the case that even data already available are not used or even collected (see Sect. 3.3).

New generations of microprocessor protective relays and substation/distribution automation systems are offering lower installed cost, integrated flow of rich information for operations and management, and improved performance of system protection and security. Those benefits cannot be achieved by one-for-one replacement of old devices with new ones.

A successful strategy should focus on how to integrate devices that once operated in isolation, and on how to use the new functional characteristics of the latest product generations to meet new operating challenges while lowering costs and improving operations. New designs must reflect innovative ways of combining proven functions and elements, building on what has been demonstrated to achieve enhanced operating and cost benefits. The strategy that leads to the most cost-effective results needs to identify the order and speed in which relays are upgraded [11, 12].
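One simple way to derive an upgrade order from the criteria mentioned above (age, maintenance records, industry experience with a relay type, and criticality of failure) is a weighted score. The sketch below is an assumption for illustration, not the method of [11, 12]: the `Relay` fields, the weights, and the normalization limits are all hypothetical and would have to be calibrated by each utility.

```python
from dataclasses import dataclass


@dataclass
class Relay:
    name: str
    age_years: float
    misoperations: int   # count taken from maintenance records
    type_risk: float     # 0..1, industry experience with this relay type
    criticality: float   # 0..1, consequence of a failure at this location


# Hypothetical weights; criticality is weighted highest so that age alone
# does not decide the ranking.
WEIGHTS = {"age": 0.2, "misoperations": 0.2, "type_risk": 0.2, "criticality": 0.4}


def replacement_score(relay: Relay, max_age: float = 40.0, max_misops: int = 5) -> float:
    """Combine the normalized criteria into a single 0..1 priority score."""
    age = min(relay.age_years / max_age, 1.0)
    misops = min(relay.misoperations / max_misops, 1.0)
    return (WEIGHTS["age"] * age
            + WEIGHTS["misoperations"] * misops
            + WEIGHTS["type_risk"] * relay.type_risk
            + WEIGHTS["criticality"] * relay.criticality)


def upgrade_order(relays: list) -> list:
    """Return the relays sorted from highest to lowest replacement priority."""
    return sorted(relays, key=replacement_score, reverse=True)
```

With these example weights, a slightly younger relay on a critical line can rank ahead of an older relay on a low-impact feeder, which reflects the point that age alone is a weak justification for an upgrade.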

To protect investments for future use, it is necessary to evaluate new technology in the time frame of the upgrades. Cost savings are also achieved through technology management and standardization. For example, introducing IEC 61850 as quickly as is practical can result in future-proof solutions with additional benefits. Use of IEC 61850 could help replace control wiring, simplify integration and data flow, allow for easier engineering and design changes, and reduce installation and O&M costs [12].

3.2 Standardization

Standardization is key to meeting every aspect of today’s reliability and power delivery needs on the smart grid. Over decades of incremental upgrades, grid infrastructure expansion was kept to a minimum because of stranded asset uncertainties, social and environmental policy and regulations, and an absence of authority to enforce regulatory measures. Now, standards should be applied in all aspects of the infrastructure, from substation design to bus configuration to control building equipment.

There is a direct relationship between grid reliability and protection and control, which justifies investments in standardized infrastructure system upgrades. Long-term vitality and viability are important strategic requirements of standardization. Some of the elements of efficient and effective systematic upgrades to meet customer demand include:

  • Regulatory compliance: Considerations of reliability and potential impact on the bulk interconnected power system, together with Regional Reliability Council discussions and the resulting directions and guidelines.

  • Requirements of high-level internal strategic directions to eliminate discrete components such as control switches, interposing auxiliary devices, and metering instrumentation.

  • Flexibility and adaptability to changing technology.

  • Familiarity and maintaining core competency skills, cohesive resource training, and resource management.

  • Benchmarking and trend setting.

  • Implementation of new technologies that bring about process and other efficiencies and decrease backlogged maintenance work.

Combining the notion of wide-area monitoring with standardization gives rise to the emerging technology area of sensor webs. Sensor web enablement (SWE) technology is a service oriented open standard developed by Open Geospatial Consortium (OGC) for discovery and acquisition of sensor data. SWE can integrate sensor data irrespective of physical/logical characteristics of the sensors, providing a platform for interoperability, essential in achieving seamless inter-utility communication [13].

Proper monitoring and the exchange of critical information in real time are key to reliable operation of the grid. The disparity in protocols used in the power industry and the lack of an information exchange infrastructure are hindrances to achieving reliability. Standards-based sensor web technology is therefore one approach to achieving interoperability in power systems. Sensor web enablement (SWE) and the common information model (CIM) address the heterogeneity of data and the lack of a central repository of sensor data needed for proper action in case of a contingency. Sensor data from utilities, published in CIM format, can be exposed via a sensor observation service (SOS). This provides a standard method for discovering and accessing sensor data between utilities, which facilitates a rapid response to contingencies. In addition, the application of SWE moves the power industry one step closer to auto-pilot operation.
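To make the idea of accessing sensor data through an SOS more concrete, the sketch below assembles a GetObservation request as a key-value-pair URL. The parameter names follow the key-value encoding of OGC SOS 2.0 as commonly documented and should be verified against the standard before use. The endpoint, offering, and observed property identifiers are hypothetical placeholders, and nothing here is taken from [13].

```python
from urllib.parse import urlencode


def build_get_observation_url(endpoint: str, offering: str, observed_property: str) -> str:
    """Build a key-value-pair GetObservation request for an SOS endpoint."""
    params = {
        "service": "SOS",
        "version": "2.0.0",
        "request": "GetObservation",
        "offering": offering,                   # hypothetical offering identifier
        "observedProperty": observed_property,  # hypothetical property name
    }
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder endpoint; a real utility service would publish its own URL.
    print(build_get_observation_url(
        "https://sos.example.org/service",
        "substation_12_pmu",
        "bus_voltage_magnitude",
    ))
```

Because every participating utility would answer the same kind of request, a neighboring control center could retrieve observations without knowing the protocol used inside the substation, which is the interoperability benefit described above.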

3.3 Information Service

Since it is not possible to completely prevent blackouts, effective and fast power system restoration is necessary to minimize the impact of major disturbances. This requires rapid decisions in a data-rich but information-limited environment. The streams of data from a variety of sensors do not provide system operators with the information they need to act within the timeframes required to minimize the impact of a disturbance. Even if there are fast models that can convert the data into information, the system operator may lack a full understanding of the context of that information and therefore cannot use it with a high degree of confidence. Some of the key elements for response in smart grids are:

  • Well-defined procedures that require overall coordination within the affected area, as well as with the neighboring grids.

  • Reliable and efficient software tools to aid operators and area coordinators in executing dynamic control procedures and in making the right decisions.

  • Control solutions reducing the overload and instability risks during recovery.

Today’s technology allows improved processes and smart systems to aid decision-making and minimize the impact of outages, both spatially and temporally. Standard operating procedures, based on pre-defined system conditions and operating parameters, can be provided via a set of power system information services. For example, rapid restoration and the minimization of outages through selective islanding are options to consider in minimizing the consequences of an outage to a user.

Information services are focused on providing the right information at the right moment to the right decision maker. High-level operational information services (i.e., actionable intelligence) are often needed along with supportive sensor data or trends to provide context. The information services required by grid operators could vary from scenario development to estimates of socio-economic impacts of failures to quantitative statistics, trends and forecasts. These services also must be available in a geospatial context and at various temporal scales to support the needs of system operators, planners, and regulatory agencies. Information services must be characterized by a strong integration of grid data with ancillary data and information, and this will require a knowledge-based approach for capturing the best practices of utilities and regulators. The complexity of these information services will require a network of partners who will contribute to the production of the services. To facilitate these services it will be incumbent upon the power research community to develop tools to facilitate operational data acquisition and handling in interoperable formats and to create information products through a coordinated process chain. The successful conversion of power sensor data into actionable intelligence will require the integration of power system expertise in modeling, data management and service delivery to describe the state of the grid and to predict responses to actual and potential change.

3.4 Education

The continuation of the technology explosion of the second half of the twentieth century requires the availability of a diverse and highly capable technical workforce. Unfortunately, the education of engineers has not kept pace with the global demand. As a result there is a tremendous shortage of technical personnel all around the world. In the context of globalization this is a complex challenge and the cooperative efforts among stakeholders are required [14].

The US Department of Energy (DOE) and the North American Electric Reliability Corp. (NERC) identified the aging workforce as a critical challenge facing the electric power industry and the educational system that supports it. If not managed properly, the loss of experience and expertise will affect reliability, safety, productivity, innovation, and the capability to solve pressing issues, such as grid modernization and climate change.

The aging of the American workforce has emerged as a critical issue facing American productivity in the 21st century. As the so-called “Baby Boomer Generation” reaches retirement eligibility, the impact will be felt across both the public and private sectors. These 78 million individuals born between 1946 and 1964 have accumulated a wealth of experience and knowledge, and represent 44% of America’s workforce. For electric utilities, whose service quality and reliability depends on maintaining an adequate, knowledgeable workforce, managing the upcoming retirement transition is a particular challenge [15].

The reliability of the North American electric utility grid is dependent on the accumulated experience and technical expertise of those who design and operate the system. As the rapidly aging workforce leaves the industry over the next five to ten years, the challenge to the electric utility industry will be to fill this void… [16].

The education of engineers has not kept pace with technological developments. Universities offer very few power systems classes in undergraduate programs, and practical experience in signal processing and advanced feedback control systems is needed to bring practical knowledge to the universities. Though power engineering is a science that can be covered through sound basic principles, its actual implementation permits alternatives. The alternative that is selected depends upon the power engineer’s experience and the traditions of the electric utility company. Indeed, the entire power engineering education curriculum is at a crossroads and needs complete rejuvenation. Experience to date has shown that students can be attracted to and retained in power programs if they are exposed early to the joys of creation through design, discovery through research, and invention through hands-on experimentation [17].

The paper by [14] gives several examples of how universities are working with industry and government to develop novel approaches in fostering power engineering education which is a lynchpin in the grid reliability quest.

4 Next Steps

This chapter presents a vision and transformation blueprint for meeting the protection and control needs of the twenty-first century to generate and deliver reliable power in the smart grid. The roadmap includes new concepts in the use of modern tools and techniques as well as hardware and applications. Protection and control markers such as resource and asset management, a process for harmonizing different plans and disciplines into a united vision, and justification strategies and benefits of investments are highlighted. The use of modern technology and methods for testing and for detecting equipment or design failures is also highlighted.

Some of the concepts suggested in this chapter, such as utilizing information services and integrating data from a myriad of sensors, will be required to maintain the social and environmental obligations of the electric utility industry. Protection and control will be instrumental in achieving the reliability, efficiency, and financial goals of the twenty-first century grid.

The exchange of information stemming from worldwide experiences and innovations in technology sheds new light on the current conditions, procedures, regulations, and design of the power systems of the future. Examining the root causes of blackouts and their effects on neighboring systems, for example, and implementing proven solutions that help prevent the propagation of such large-scale events should help in designing reliable power delivery infrastructures for today and for the future.