Physical-engineered critical infrastructures (CIs) are characterized as large scale, spatially distributed, complex networks—either open or closed. According to Dueñas-Osorio and Vemuru (2009), these systems are made of “a large number of interacting components (real or virtual), show emergent properties difficult to anticipate from the knowledge of single components, are characterized by a large degree of adaptability to absorb random disruptions and are highly vulnerable to widespread failure under adverse conditions.” Indeed, small perturbations can trigger cascades and large-scale consequences in CIs; furthermore, disruptions may also be caused by targeted malicious attacks.

2.1 Complexity

A recent National Science Foundation (NSF) workshop report (Guckenheimer and Ottino 2008) points out that a complex system is characterized by an internal structure which may consist, besides many interacting components, of “a network that describes which components of the system interact, multiple scales of space and/or time, or symmetry. The components of many complex systems are heterogeneous and form a hierarchy of subsystems.” Furthermore, uncertainty is regarded as pervasive in complex systems, and its characterization and propagation through the system are regarded as key to reliably predicting the system behavior and to controlling the system safely.

The above attributes draw the boundary between simple and complex systems. It is less trivial to draw a boundary between complicated and complex systems. Table 2.1 attempts to do so by highlighting the very essence of a complex system, which is believed to lie in the degree and modality by which the parts interact and in the overall behavior of the system that emerges from these interactions. “The system must be analyzed as a whole; decomposing the system and analyzing subsystems does not necessarily give a clue as to the behavior of the whole” (Guckenheimer and Ottino 2008).

Table 2.1 Characteristics of complicated versus complex systems, both entailing a large number of highly connected components

2.2 Learning from Experience

Despite the warnings of Cassandras, CIs have proved highly reliable in, and beneficial for, Western societies. Nevertheless, major breakdowns have occurred, illustrating the complexity of system behavior and of the event sequences that may develop, and showing the negative consequences of dependencies that lead to cascading effects.

In the electrical transmission CIs, for example, the analysis of recent major blackouts from 2003 to 2006 (Table 2.2) allows some conclusions to be drawn on the main underlying causes and some patterns of common behavior to be identified:

Table 2.2 Recent major blackouts of electric power supply systems

  • Technical failures (Denmark/Sweden, two independent failures), external impacts (Tokyo, construction work; Brazil, extreme weather conditions) and adverse behavior of protective devices (London) are important triggering events, when not protected by the N-1 security criterion (Footnote 1) and/or in combination with high-load conditions (Moscow).

  • Organizational factors such as market liberalization and short-term contracting causing operation of the system beyond original design parameters (e.g., Great Lakes, Italy), and stressing operation conditions such as weakening maintenance work and/or inadequate integration of intermittent power generation (e.g., Western Europe) have proven to be outstanding causes.

  • As the transmission system operators (TSOs) play a decisive role with regard to contingency management, lack of situational awareness and short-term preparedness, as well as limited real-time monitoring beyond control areas and poorly timed cross-border coordination (e.g., Great Lakes, Italy, Switzerland (rail)), act as aggravating factors.

  • The inadequacy of the N-1 security criterion and, even more importantly, its inadequate evaluation/implementation in various cases have prompted attempts to make it more stringent and legally binding.

Also, lack of investment due to increasing economic pressure, public resistance, etc., can be observed in many countries and areas, leading to insufficient system monitoring, control, and automation as well as to insufficient grid extension and maintenance (including tree cutting programs [Great Lakes, Switzerland/Italy]), and thus contributing significantly to past blackouts.

As expected, disruption of electricity supply strongly affects our society and other infrastructures which depend on it. The Italian electric power blackout on September 28, 2003 at 3.01 a.m. (Sunday/Monday night) may serve as an example to further elucidate the course of events and delineate the associated consequences:

On that date, one of the main north–south transit lines through Switzerland, the Lukmanier transmission line, shut down following a flashover between a conductor cable and a tree. This resulted in a redistribution of the electricity flows in accordance with the laws of physics, and a subsequent overload (110%) of another north–south transit line, namely the San Bernardino transmission line, which also shut down at 3.25 a.m. due to another flashover. What followed was a series of cascading failures of other transmission lines in the border region. At that time, the Italian grid was completely separated from the UCTE (Footnote 2) network. Despite primary frequency control, automatic disconnection of the pump storage plants in Northern Italy, and automatic load shedding (10 GW), the voltage and frequency drop within the Italian grid could not be mastered and generation plants started to trip. This in turn gave rise to a total blackout throughout Italy (except Sardinia) at 3.27 a.m.

After 3 hours, energy was restored in some regions connected to France (such as Liguria). Nine hours later, in the afternoon of September 28, electricity was restored gradually in most places, including Turin, Milan, Venice, and Rome. The energy not supplied due to the rotating outages totaled 12.9 GWh. Rolling blackouts continued to affect about 5% of the population over the next 2 days (September 29–30) as the electricity company ENEL continued its effort to restore supply. Restoring power to the whole country took 18 hours. As a consequence, other infrastructure sectors were affected, showing their strong dependence on electricity supply (Fig. 2.1). The effects on the population, the economy and other infrastructures are given in greater detail in Table 2.3.

Fig. 2.1 Impact of the Italian blackout on other infrastructure sectors

Table 2.3 Effects of the Italian blackout, September 28, 2003

The role of failure cascades and (inter)dependencies among infrastructures (see Chap. 1 and Sect. 2.3) is highlighted in the real examples listed in Table 2.4. Details of the mini telecommunication blackout in Rome, Tor Pagnotta Street, on January 2, 2004 at 5.30 a.m., demonstrate the challenges to (inter)dependency analysis:

Table 2.4 Examples of the importance of interdependencies between critical infrastructures (based on ETH–Laboratory for Safety Analysis 2008)

Flooding of a major Telecom Italia telecommunication service node occurred when a metallic pipe carrying cooling water for the air conditioning plant broke. The flooding caused several boards/devices to fail due to short circuits and put the main power supply out of service. The diesel generators, part of the emergency power supply, failed to start due to the presence of water; only batteries provided power to the boards/devices still working, and these eventually ran down.

The Fire Brigade arrived at 7.30 a.m., pumped out the flood water and eventually located the point of the pipe breakage. To start the repair actions, technicians had to shut down the air conditioning plant. The mini blackout caused problems and delays in different infrastructures, including Fiumicino airport, the ANSA press agency, post offices and banks, ACEA power distribution and the communication network (both fixed-to-fixed and fixed-to-mobile), connecting the main Italian research institutions (Fig. 2.2).

Fig. 2.2 Infrastructures affected by the mini telecommunication blackout in Rome, 2004

As mentioned before, the Telco mini blackout also impacted services of the ACEA electrical power distribution grid. ACEA has two Control Centres: the manned Main Control Centre (Ostiense) and the unmanned Disaster Recovery Control Centre (Flaminia). All the tele-measures, commands, and alarms managed by the unmanned control centre are dispatched to the manned one using two redundant TELCO communication links at 2 Mbit/s. One is the main link, the other a backup link that is always in stand-by state. These links were expected to follow two different geographical paths. Due to a maintenance operation, however, both links were traversing the same flooded node and were therefore out of service during the blackout. As a consequence, there was no means of exchanging alarms and signals on the status of the power distribution network, or commands, between the unmanned centre and the manned one. In this situation, ACEA completely lost the monitoring and control of all the remote substations managed by the unmanned control centre for a total of 1 h and 23 min. The difficulty of the manual diagnostic and recovery actions by ACEA operators was further increased by the partial outage of fixed and mobile phones.

Fortunately, conditions were very favorable during the blackout, so that the power grid required no control actions from the ACEA Control Centres to the Remote Terminal Units for the duration of the Telco mini blackout.
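The loss of both communication links through a single flooded node suggests a simple check that could be run whenever routes are changed. The following Python sketch is illustrative only: the node names and routes are hypothetical and do not reproduce the actual ACEA or Telecom Italia topology. It merely shows how two nominally redundant routes might be tested for shared intermediate nodes.

```python
def shared_nodes(primary_path, backup_path):
    """Return the intermediate nodes traversed by both routes.

    The endpoints (the two control centres) are necessarily common to both
    routes, so they are excluded from the result.
    """
    endpoints = {primary_path[0], primary_path[-1],
                 backup_path[0], backup_path[-1]}
    common = set(primary_path) & set(backup_path)
    return sorted(common - endpoints)


# Hypothetical routes between the two control centres (names are assumptions).
main_link = ["flaminia", "node_a", "flooded_node", "ostiense"]
backup_planned = ["flaminia", "node_b", "node_c", "ostiense"]
# Backup route after a hypothetical maintenance rerouting.
backup_rerouted = ["flaminia", "node_b", "flooded_node", "ostiense"]

print(shared_nodes(main_link, backup_planned))   # []
print(shared_nodes(main_link, backup_rerouted))  # ['flooded_node']
```

An empty result indicates that the two routes share no intermediate node; any non-empty result flags a potential single point of failure for both links.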

2.3 Dimensions of Interdependencies

As clearly demonstrated by experienced events, dependencies and in particular interdependencies bear significant practical relevance rather than being a (fairly) new theoretical concept. Rinaldi et al. (2001) introduced six dimensions for their description, and a categorization into four general “types of interdependencies” (Fig. 2.3):

Fig. 2.3 Dimensions for describing infrastructure interdependencies (Rinaldi et al. 2001)

  • Physical interdependencies: the state of each infrastructure is dependent on the material output(s)/flow(s) of the other, e.g., a pipeline network provides gas to fuel a gas-fired power station while the electricity generated is used to power the compressors and controls of the gas supply network.

  • Geographic interdependencies: elements are in close spatial proximity and a local environmental event, e.g., an earthquake, flooding or a fire, affects components across multiple infrastructures.

  • Cyber interdependencies: infrastructures are connected to one another via electronic, informational links, e.g., a supervisory control and data acquisition (SCADA) system monitors and controls elements of the electric power grid; likewise, it may provide pieces of information or intelligence supporting another infrastructure or a decision-making process elsewhere.

  • Logical interdependencies: these exist between infrastructures and do not fall into one of the above categories.

The “coupling (Footnote 3) and response behavior” of interdependent CIs deserves special attention, as it directly influences whether the infrastructures are adaptive or inflexible when perturbed or stressed. Rinaldi et al. (2001) introduce three primary coupling characteristics:

  • The degree of coupling, either tight or loose, which addresses how strongly a disturbance in one agent is correlated with disturbances in another; e.g., a gas-fired space heating system without storage is tightly coupled to the gas supply system, without “time to give” or slack.

  • The coupling order, either directly connected (first-order effect) or indirectly connected through one or more intervening infrastructures (second-order up to nth-order effects); e.g., loss of electric power directly affects the pumps and controls of the space heating system and indirectly affects its fuel supply via the compressors of the gas supply system, if they are electrically driven.

  • The linearity or non-linearity/complexity of the interaction, i.e., whether or not agents can interact with other agents outside the normal production or operational sequence; such interactions are not intended by design, are subtle and difficult to detect, and show unfamiliar feedback loops; e.g., a large-scale areal event such as extreme heat affecting various agents simultaneously.
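The notion of coupling order can be made concrete with a small graph sketch. The Python fragment below is a minimal illustration under stated assumptions: the three-node graph is a hypothetical encoding of the space heating example above, the function name is invented for this sketch, and the order of a dependency is taken to be the number of links on the path between the two infrastructures.

```python
def dependency_paths(supplies, source, target, _path=None):
    """Enumerate all simple paths from source to target.

    `supplies` maps each infrastructure to the infrastructures that
    depend on it (supplier -> dependents).
    """
    path = (_path or []) + [source]
    if source == target:
        return [path]
    found = []
    for dependent in supplies.get(source, []):
        if dependent not in path:  # do not revisit nodes (feedback loops)
            found.extend(dependency_paths(supplies, dependent, target, path))
    return found


# Hypothetical encoding of the example: electric power drives the heating
# pumps/controls directly and the gas compressors, which in turn supply fuel.
supplies = {
    "electric_power": ["space_heating", "gas_supply"],
    "gas_supply": ["space_heating"],
}

paths = dependency_paths(supplies, "electric_power", "space_heating")
for p in paths:
    print(len(p) - 1, " -> ".join(p))
# 1 electric_power -> space_heating
# 2 electric_power -> gas_supply -> space_heating
```

In this toy graph the space heating system is exposed to a loss of electric power both through a first-order and through a second-order path.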

Figure 2.4 illustrates these characteristics, taking the prolonged electric power problems in California as the basis. Elements of other dimensions, in particular “environment” (business and economic [deregulation], legal and regulatory [public policy]) and “type of failure” (subset of common causes), might be added to the four categories of types of interdependencies above, or used to specify “logical interdependencies”. The latter has been proposed by Pederson et al. (2006), who slightly expanded the above taxonomy of physical, cyber (renamed informational), and geographic (renamed geospatial) interdependencies by adding policy/procedural and societal interdependencies. These types of interdependencies carry a state, or the consequences of events, from one infrastructure to other infrastructures although no direct linkage or relationship in a physical sense exists; the halt of air traffic for more than 24 h in the US and the worldwide drop in air traffic following the “September 11 attack” may serve as reference examples.

Fig. 2.4 Examples of nth-order interdependencies and effects (Rinaldi et al. 2001)

With regard to physical-engineered CIs, the six dimensions proposed by Rinaldi et al. (2001) still seem to be appropriate to facilitate the identification, understanding, and analysis of interdependencies as well as of dependencies, and to frame the requirements for modeling and simulation approaches. A multiple/combined (rather than “silver bullet” single) approach is needed to address, in a consistent manner, all of these interrelated factors and system environment classes/attributes, respectively.

The following extensions to the six dimensions and related elements are here proposed (see Fig. 2.5 as a modification of Fig. 2.3) to strengthen their representation:

Fig. 2.5 Dimensions for describing infrastructure interdependencies (according to Rinaldi et al. 2001, modifications by the authors in italic)

“State of operation”:

  • “normal”: distinction between nominal, peak, off peak

  • “repair” extended to maintenance during continuous operation or down states

“Type of failure”:

  • “common cause initiating” added to “common cause”

“Types of interdependencies”:

  • “cyber” changed to informational, including hard- and software

  • “geographical” changed to geospatial

  • Note: “logical” includes lacking diversity, functional, etc.

“Environment”:

  • “Speed of developments/changes” added

In general, it is difficult to clearly define the boundaries of individual infrastructures and to model the (inter)dependencies among them adequately. Often, infrastructures are decomposed into a physical part and a supporting/controlling part and modeled in a linear fashion. Let us take the electrical power grid (physical system under control) and the monitoring and control (SCADA) system as examples. Only if the SCADA system is dedicated, i.e., does not use commercial systems such as the open Internet to transfer data and commands and does not incorporate common hardware and software, can it be modeled as part of the electric power infrastructure, including the dependencies within it. If not, as is obviously the case in many countries, it is closely linked to the information and communication infrastructure and must be modeled in a less simplistic way, “not to overlook the true complex nature of interconnected infrastructures” (Rinaldi et al. 2001). Taking this into account, many vulnerability analysts call for a “system-of-systems” approach.

Failures (negative impact) that arise from (inter-)dependencies can be classified as follows:

  1. One event causing failure or loss of service of more than one infrastructure due to spatial proximity, such as areal external events (earthquakes, floods, extreme weather conditions, etc.) (called common cause initiating events).

  2. Failure of one infrastructure causing failure or loss of service of at least one other infrastructure, e.g., rupture of mains of the water supply system (called cascade initiating events).

  3. Failure or loss of service resulting from an event in another infrastructure, e.g., failure of gas lines due to loss of main electricity supply if compressors are electrically driven (called cascade resulting events).

  4. Failure or loss of service of one infrastructure escalating (“domino effect”) because of failure of another affected infrastructure, e.g., failure of the electric power system leading to failure of the SCADA system and thereby affecting restoration of the electric power system (called escalating events).

Events that belong to none of these four types may be called independent. The types of non-independent events are not mutually exclusive.
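The classification above can be sketched as a labeling rule over event records. The Python fragment below is a rough illustration, not a method taken from the cited literature: the `Event` record, its field names, and in particular the proxy used for escalating events (an infrastructure that is both disrupted by and acts back on the event) are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Event:
    """Hypothetical record of one disruption event (field names are assumptions)."""
    infrastructure: str
    external_trigger: Optional[str] = None  # e.g. "flood"; None if no areal event
    caused_by: List[str] = field(default_factory=list)  # infrastructures whose failure led to this event
    affects: List[str] = field(default_factory=list)    # infrastructures disrupted by this event


def classify(event: Event, all_events: List[Event]) -> Set[str]:
    """Return every label that applies; the types are not mutually exclusive."""
    labels = set()
    # (1) the same external trigger hit more than one infrastructure
    if event.external_trigger is not None:
        hit = {e.infrastructure for e in all_events
               if e.external_trigger == event.external_trigger}
        if len(hit) > 1:
            labels.add("common cause initiating")
    # (2) this failure disrupted at least one other infrastructure
    if event.affects:
        labels.add("cascade initiating")
    # (3) this failure resulted from an event in another infrastructure
    if event.caused_by:
        labels.add("cascade resulting")
    # (4) crude proxy for feedback: a disrupted infrastructure acts back
    if set(event.affects) & set(event.caused_by):
        labels.add("escalating")
    return labels or {"independent"}


# Hypothetical events: a power failure coupled with SCADA, and an isolated one.
power = Event("electric_power", affects=["scada"], caused_by=["scada"])
water = Event("water_supply")
events = [power, water]

print(sorted(classify(power, events)))
# ['cascade initiating', 'cascade resulting', 'escalating']
print(sorted(classify(water, events)))
# ['independent']
```

Returning a set rather than a single label reflects the statement that the non-independent types may overlap.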

2.4 Empirical Investigations on Critical Infrastructure (Inter)Dependencies

As outlined before, interdependencies within and among CIs are recognized both as opportunities, e.g., increased coping capacity, and as elements of increased vulnerability. With regard to the latter, empirical studies have been made and published, focusing on building databases of different kinds and/or on obtaining findings for decision making.

For example, the database in (Luiijf et al. 2009) was built from public reports of disruptions of CIs from open sources like newspapers and Internet news outlets. The following results were derived from a subset of 1,749 failure incidents in 29 European countries (95% occurred after 2000) with a noticeable effect on society (e.g., at least 100,000 electric power customers affected). Events have been classified as “cascade initiating”, “cascade resulting,” and “independent” (see previous section). The results disclose that:

  • “Cascade resulting” events are more frequent than anecdotally thought: almost 30% (501 out of 1,749) of the reported incidents result from incidents in other services. “Cascade initiating” events are about half as frequent (268 out of 1,749, see Table 2.5).

    Table 2.5 Categorization of number of CI disruption events (Luiijf et al. 2009)

  • The dependency matrix is sparsely populated and cascades are highly asymmetrical; the dependencies are more focused and directional than often thought.

  • Energy and telecommunication are the main cascade initiating sectors; energy is the only sector which initiates more cascades than it ends up receiving (146 versus 76, see Table 2.5).

  • Within the energy sector, 61 (out of 65) dependencies exist between the electrical power subsector services, and within telecommunication services, disruptions of telecom backbones most seriously affect Internet services (see Table 2.6).

    Table 2.6 Events categorized by initiating and affected sector (number of events) (Luiijf et al. 2009)

With regard to escalation of cascades, the analysis shows that a cascade initiating event in the energy sector triggers, on average, 2.06 disruptions of other services; taking all events into account (including independent ones), however, only one out of two events triggers a disruption of another CI. For the telecommunication sector the respective numbers are 1.86 disruptions and two out of five. Interestingly, 421 (out of the 501 cascade resulting events) are first-level, 76 are second-level, and 4 are third-level cascades; no deeper cascades have been found. This contradicts the results of the evaluation of selected experienced events (Table 2.4) and may be due to a bias of the open source media information consulted.
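The proportions quoted above can be reproduced with a few lines of arithmetic. The Python sketch below uses only the event counts stated in the text; the variable and function names are chosen for this illustration.

```python
# Event counts quoted in the text from Luiijf et al. (2009)
total_events = 1749
cascade_resulting = 501
cascade_initiating = 268
cascade_levels = {1: 421, 2: 76, 3: 4}  # cascade depth -> number of events


def percent(part, whole):
    """Share of `part` in `whole`, in percent, rounded to one decimal."""
    return round(100 * part / whole, 1)


# The three cascade levels should add up to all cascade resulting events.
assert sum(cascade_levels.values()) == cascade_resulting

print(percent(cascade_resulting, total_events))           # 28.6 ("almost 30%")
print(percent(cascade_initiating, total_events))          # 15.3
print(round(cascade_initiating / cascade_resulting, 2))   # 0.53 ("about half")
for level, count in cascade_levels.items():
    print(level, percent(count, cascade_resulting))
# 1 84.0
# 2 15.2
# 3 0.8
```

The counts are internally consistent: the first-, second-, and third-level cascades sum to the 501 cascade resulting events, of which about 84% are first-level.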

Another fairly simple measure of (inter)dependency was proposed by Zimmermann (2004); it compares the duration of outages in the initial system disruption, e.g., a power outage, with the duration of outages of specific public services and businesses affected. The investigations were based on power outage data from North America from 1990 through 2004. The results showed that, for affected public services, the duration of outages linked to the electricity outage exceeded the duration of the initial power outage itself, pointing to the fact that cascading events did escalate.
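One way to express such a comparison is as a ratio of durations. The Python sketch below is an assumption about how the comparison might be formalized, not the formula used in the cited study; the function name and the durations are hypothetical and serve only to show the arithmetic.

```python
def outage_duration_ratio(service_outage_h, power_outage_h):
    """Ratio of the affected service's outage duration to the initial power outage.

    A value above 1 means the dependent service stayed down longer than
    the power outage that triggered it.
    """
    if power_outage_h <= 0:
        raise ValueError("power outage duration must be positive")
    return service_outage_h / power_outage_h


# Hypothetical durations in hours (not taken from the cited data).
ratio = outage_duration_ratio(service_outage_h=12.0, power_outage_h=8.0)
print(ratio)  # 1.5
```

Under this reading, the finding reported above corresponds to ratios greater than 1 for the affected public services.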

2.5 Degree of Criticality

The definition(s) of CI (this chapter) focus on systems and assets which are considered critical with regard to some criteria, possibly varying from one country- or region-specific definition and associated perspective to another. These definitions (Table 2.7) are descriptive rather than a precise yardstick to objectively assess criticality and its degree. There is no standardized usage and broad-based mutual understanding of what criticality is and how to measure it (see also Bouchon 2006). Nevertheless, the description of conditions for a critical situation focuses on a disturbance or loss of continuous (reliable) service and places the analysis of CIs in the realm of the analysis of reliability (system view) or availability (user view), of risks for the owner/operator and/or the public due to adverse events, and of vulnerability of the system and/or the society.

Table 2.7 Elements for criticality definition

Infrastructures are considered highly critical because they are both a potential trigger of a crisis and a means to resolve a crisis/emergency situation; telecommunication systems may illustrate this. Therefore, criticality objectives are primarily related to the strategic objectives of a state entity, although other subjective standpoints, i.e., those of owners/operators, insurers, other stakeholders and the general public, are worth mentioning and may lead to other criticality criteria. Referring to the Swiss Federal program for critical infrastructure protection (CIP) as an example, the “criticality of an infrastructure” denotes its relative importance regarding the potential consequences of a disturbance, functional deficiency, or destruction for the public and its vital resources (Federal Office of Civil Protection 2009). The probability of such an event is deliberately not regarded as important.

Attempts have been made to further specify criticality and to distinguish degrees of criticality. The European Commission (EC 2004) has proposed to distinguish three criteria within its concept of CIP, which is focused on the fight against terrorism:

  • Scope: The loss of a critical infrastructure element is rated by the extent of the geographic area which could be affected by its loss or unavailability, i.e., international, national, provincial/territorial, or local.

  • Magnitude: The degree of the impact of loss can be assessed as none, minimal, moderate, or major. Among the criteria which could be used to assess potential magnitude are:

    (a) Public impact (amount of population affected, loss of life, medical illness, serious injury, evacuation)

    (b) Economic (GDP effect, significance of economic loss and/or degradation of products or services)

    (c) Environmental (impact on the public and surrounding location)

    (d) Interdependency (among other critical infrastructure elements)

    (e) Political (confidence in the ability of government)

  • Effects of time: This criterion ascertains at what point in time the loss of an element could have a serious impact (i.e., immediate, 24–48 h, 1 week, other).

This has been taken up by the International Risk Governance Council (IRGC) to assess the degree of criticality of five coupled physical-engineered infrastructures. This was done semi-quantitatively, based on expert judgment and screening analysis applying the so-called traffic light model (IRGC 2006); see Fig. 2.6 for a snapshot of the results. Sometimes it might be of interest to distinguish between objectives, e.g., economy/economic security versus public health and safety, and to address interdependencies separately.

Fig. 2.6 Example for the evaluation of the degree of criticality (IRGC 2006)

Taking again the Swiss CIP program as the example, it was proposed to assess the criticality at the level of 31 subsectors (Footnote 4) according to their “relative importance” (see definition above). The main purpose is to guide more detailed analyses aiming at the identification of critical elements in prioritized infrastructures (called vertical criticality) and to support strategic planning (Federal Office of Civil Protection 2008). Three criteria are distinguished:

  • Effect on other subsectors (dependence)

  • Effect on the public

  • Effect on the economy

Four categories have been established to specify the effect (consequences): none (0), small (1), medium (2), and large (3). It is assumed that the “loss of continuous service” will take place without pre-warning, will last approximately 3 weeks, and will affect the whole country. A spider diagram is used to illustrate the results (Fig. 2.7): electric power supply and telecommunication are ranked highest.

Fig. 2.7 Criticality of subsectors (Federal Office of Civil Protection 2008)
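The three-criteria, four-category scheme described above can be sketched as a simple scoring table. The Python fragment below is illustrative only: all scores are hypothetical (they are not the values of the Swiss assessment), and summing the three criteria into one total is an assumption made here for ranking purposes, whereas the source presents the criteria side by side in a spider diagram.

```python
# The three criteria named in the text; each is scored from 0 (none) to 3 (large).
CRITERIA = ("other_subsectors", "public", "economy")


def rank_subsectors(scores):
    """Rank subsectors by the sum of their criterion scores (assumed aggregation)."""
    for name, entry in scores.items():
        for criterion in CRITERIA:
            if entry[criterion] not in (0, 1, 2, 3):
                raise ValueError(f"invalid score for {name}/{criterion}")
    totals = {name: sum(entry[c] for c in CRITERIA)
              for name, entry in scores.items()}
    # Highest total first; ties are broken alphabetically for a stable order.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores, chosen only to illustrate the mechanics.
scores = {
    "electric_power": {"other_subsectors": 3, "public": 3, "economy": 3},
    "telecommunication": {"other_subsectors": 3, "public": 3, "economy": 3},
    "hypothetical_subsector_x": {"other_subsectors": 1, "public": 2, "economy": 1},
}

for name, total in rank_subsectors(scores):
    print(name, total)
# electric_power 9
# telecommunication 9
# hypothetical_subsector_x 4
```

A single total hides which criterion drives the ranking, which is one reason a per-criterion display such as a spider diagram may be preferable in practice.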

Activities are under way at the level of the European Union aimed at identifying “European critical infrastructures” (EC 2008) by taking three aspects into account:

  • Risk of casualties (number of fatalities and/or injuries)

  • Economic loss (percentage of GDP)

  • Public effect (number of people affected)

To be regarded as “critical” for the European Union, a loss of service of the infrastructure under consideration (currently the energy [electric power, oil and gas supply] and transport sectors) must significantly affect at least two Member States. As in all other cases known to the authors, the definition and assessment of criticality focus on loss of service; misuse of infrastructure to intentionally cause harm to the public, the economy, and government (“weaponizing”) is not taken into consideration. Penetration into unmonitored parts of an urban drinking water system and dumping of hazardous substances may serve as a fictitious example.