The capacity to respond to the diversity of situations that may arise is one of the cornerstones of the Resilience Engineering perspective on safety management. This chapter describes a framework for collecting and analyzing data to support the assessment of this capacity and the proposal of corrective actions. The theoretical background of Resilience Engineering underpins the definition of performance indicators. Individual and collective interviews help identify factors to be corrected and others to be preserved.

1 Introduction

The capacity to respond to the diversity of situations that may occur is one of the critical cornerstones of resilience engineering (Hollnagel, 2011). To be considered resilient when facing an abnormal condition, agents have to adjust their behavior to prevent unwanted outcomes and continue accomplishing their duties according to their model of performance. Depending on the situation’s nature, agents adapt their behavior by considering their experience, rules and procedures, leadership, or improvisation.

A review of the development of resilience metrics in the railway domain (Besinovic, 2020) shows that such metrics are developed to support the network’s robustness to disturbances and the optimization of train scheduling and rescheduling. Other work has applied the Resilience Engineering perspective to the railway domain. Ferreira (2011) applied it to railway planning activities. Siegel and Schraagen (2014) propose a so-called resilience state model for railway systems adapted from Rasmussen’s (1997) system boundaries and Woods and Wreathall’s (2008) stress-strain model. De Regt, Siegel, and Schraagen (2016) propose metrics to quantify weak resilience signals.

The framework proposed in this chapter focuses on the sociotechnical system’s capacity to respond and aims to support its formalization and its assessment by identifying essential factors to be preserved and vulnerability factors to be corrected. The first section describes the theoretical background shaping the development and the different phases that structure the framework’s application. The following sections detail them. Finally, the last part describes a synthesis of the results of its implementation.

2 Rationale for the Overall Approach

The framework helps identify and handle, in a systematic and structured manner, gaps and needs related to a system’s ability to respond to the diversity of situations that may arise (Rigaud et al. 2013, 2018). Safety managers can use the framework to enhance their understanding of the system’s complexity and to structure learning, training, and change management activities that improve the security of their operations. They can apply it at different scales (technological system, process, unit, plant).

3 Basis and Sources of the Framework

The Resilience Engineering perspective on safety management structures the framework. Borys, Else, and Leggett (2009) consider Resilience Engineering as the fifth age of safety. This period follows a phase of integration (Glendon et al., 2006) of technical, human, managerial, and cultural factors in risk management practices (Hale & Hovden, 1998).

Douglas and Wildavsky (1983) consider that no one can know and predict all the potential risks and associated consequences. Risks are selected using rational and irrational criteria, and even within the scientific community there is rarely a consensus regarding potential risks and accompanying problems. The Resilience Engineering perspective aims to endow systems with the requisite imagination to respond to and overcome the diversity of situations that can occur (Adamski & Westrum, 2003; Woods & Hollnagel, 2006). The aim is to shift the main focus of safety management from the prevention of risks to the development of workers’ adaptive capacity to stay in control despite the variability and complexity of situations and the lack of time, knowledge, competence, or resources (Hollnagel & Woods, 2006). The target is the development of the resilience of systems. Resilience refers to the “intrinsic ability of a system to adjust its functioning before, during, or following changes and disturbances, so that it can sustain required operations under both expected and unexpected situations” (Hollnagel, 2011). It also refers to the “ability to recognize and adapt to handle unanticipated perturbations that call into question the model of competence, and demand a shift of processes, strategies, and coordination” (Woods, 2006).

The capacity to respond to regular and irregular variability, disturbances, and opportunities either by adjusting the way things are done or activating readymade responses is one of the four essential capacities that structure the conceptualization of system resilience (Hollnagel, 2011). The three others are the capacity to monitor changes, the capacity to anticipate developments, threats, and opportunities, and the ability to learn the right lessons from the right experience.

4 Theoretical Background

The framework’s theoretical background comprises seven situations, which describe the diversity of conditions that can occur within the system and support data collection, and two performance indicators, which support the assessment of system performance.

5 Situations of Resilience

Assessing the capacity to respond following the Resilience Engineering perspective on safety management requires considering different situations and associated adaptive behavior. Five variables structure the definition of these situations:

  • The type of adverse situation. The classification of adverse situations considers, first, whether the system finds them normal or abnormal and, second, their predictability. The typology thus distinguishes four types: normal situation, regular abnormal situation, irregular abnormal situation, and exceptional/unexampled situation.

  • Adaptive processes. The functions considered for describing the adaptive processes that respond to the different situations are: 1) event detection, 2) situation recognition, 3) decision to act, 4) definition of the behavior, 5) mobilization of resources, 6) action.

  • Existence of good practices and/or procedures. For each adaptive process identified, the existence of good practices and/or procedures is considered.

  • Context of action. The context of action is the difference between the competence, knowledge, resources, and time required to perform the identified adaptive processes adequately and the competence, knowledge, resources, and time available.

  • Performance model. The criteria used for assessing the system performance are quality, reliability, safety, security, sustainability, etc.

The variables induce seven situations of resilience to consider when collecting data and assessing the capacity to respond:

  1. The situation is normal, covered by procedures or good practices, and the context (time, knowledge, competencies, and information) necessary to respond is available. Agents can recognize the situation, define their future behavior by using their experience or by adapting a known and regularly applied procedure, and apply it in conformity with all the dimensions of performance of the activity.

  2. The situation is normal and covered by procedures or good practices. However, the context (time, knowledge, competencies, information) necessary to respond is not available. Agents can recognize the situation, define their future behavior by using their experience or by adapting a known and regularly applied procedure, and apply it with creativity to conform with all dimensions of the activity’s performance despite the lack of one kind of resource.

  3. The situation is normal and not covered by procedures or good practices. Agents can recognize the situation and the fact that neither procedures nor good practices support them in defining the behavior to adopt; they are creative in defining their future behavior and apply it in conformity with all dimensions of performance of the activity.

  4. The situation is abnormal (perturbation, crisis), covered by procedures or good practices, and the context (time, knowledge, competencies, and information) necessary to respond is available. Agents can recognize the situation and the necessity to adopt a non-routine behavior; they define their future behavior by using their experience, by adapting a known procedure, or by finding one in a guideline, and they apply it in conformity with all the dimensions of performance of the activity, contributing to the continuity of the activity of the system.

  5. The situation is abnormal and covered by procedures or good practices, but the context (time, knowledge, competencies, information) necessary to respond is not available. Agents can recognize the situation and the necessity to adopt a non-routine behavior; they define their future behavior by using their experience, by adapting a known procedure, or by finding one in a guideline, and they apply it with creativity in order to conform with all dimensions of performance of the activity despite the lack of one kind of resource.

  6. The situation is abnormal and not covered by procedures or good practices. Agents can recognize the situation and the necessity to adopt a non-routine behavior. Neither procedures nor good practices support them in defining the behavior to adopt. They are creative in defining their future behavior and apply it in conformity with all dimensions of the activity’s performance.

  7. The situation is unexampled. Agents are creative in responding and in contributing to the continuity of activity of the system.
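The mapping from the variables to the seven situations can be made explicit with a short sketch. It assumes, for illustration only, that a situation is reduced to three of the variables (its type, whether a procedure or good practice covers it, and whether the required context is available); the names `SituationType` and `resilience_situation` are hypothetical and do not come from the framework itself.

```python
from enum import Enum


class SituationType(Enum):
    NORMAL = "normal"
    ABNORMAL = "abnormal"      # regular or irregular abnormal situations
    UNEXAMPLED = "unexampled"  # exceptional situations without precedent


def resilience_situation(kind: SituationType,
                         covered_by_procedure: bool,
                         context_available: bool) -> int:
    """Return the number (1-7) of the matching situation of resilience."""
    if kind is SituationType.UNEXAMPLED:
        return 7
    # Situations 1-3 are normal, situations 4-6 are abnormal.
    base = 1 if kind is SituationType.NORMAL else 4
    if not covered_by_procedure:
        # Situations 3 and 6: no procedure or good practice exists,
        # whatever the available context.
        return base + 2
    # Situations 1/4 (context available) or 2/5 (context lacking).
    return base if context_available else base + 1


# An abnormal situation covered by a procedure, with time or information lacking.
print(resilience_situation(SituationType.ABNORMAL, True, False))  # prints 5
```

Such a function could help code interview excerpts consistently during data collection; the adaptive processes and the performance model are deliberately left out of this simplification.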

6 Performance Indicators

The seven situations will support data collection and system description. Two performance indicators support the evaluation of the system’s capacity to respond.

The first indicator is related to the capacity of operational agents to adjust their procedural or methodological framework or be creative to carry out their regular activity despite the variability of their environment while respecting the temporal, economic, and activity-specific performance criteria. Four rules structure the indicator:

  1. Agents know their work and associated performance criteria.

  2. They have the skills or know the procedures to follow and have the resources, time, and information to follow the different performance criteria.

  3. If they lack skills, resources, time, or information, they can be creative in carrying out their work according to performance criteria.

  4. If the situation changes and the procedural framework is no longer applicable, they can be creative enough to carry out their work following performance criteria and have the necessary margins of maneuver.

The second indicator is related to the capacity of operational agents to adjust their normative or methodological framework or to be creative in order to face and overcome the occurrence of an urgent or unexpected situation, anticipated or not, while respecting the temporal, economic, and activity-specific performance criteria.

The four rules associated with the indicator are:

  1. Agents are aware of the abnormal situations, the behavior to adopt when they occur, or what document to consult.

  2. They have the skills, resources, time, and information to respond to the situation following the different performance dimensions.

  3. If they lack skills, resources, time, or information, they can be creative in responding to the situation following the different performance dimensions.

  4. If the situation changes and the procedural framework is no longer applicable, or there is no procedural framework, they can be creative in responding to the situation.
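One way to record the evaluation of an indicator against its four rules is sketched below. The chapter does not specify a rating scale at this point, so the yes/no judgement per rule and the "share satisfied" summary are assumptions made for illustration; `RuleRating` and `indicator_summary` are hypothetical names, and the example ratings are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RuleRating:
    rule: str        # short label for one of the four rules
    satisfied: bool  # assumption: a yes/no judgement per rule
    evidence: List[str] = field(default_factory=list)  # interview or observation notes


def indicator_summary(ratings: List[RuleRating]) -> Dict[str, object]:
    """Summarize one indicator: share of rules satisfied and rules to correct."""
    if not ratings:
        raise ValueError("at least one rule rating is required")
    unmet = [r.rule for r in ratings if not r.satisfied]
    share = (len(ratings) - len(unmet)) / len(ratings)
    return {"share_satisfied": share, "to_correct": unmet}


# Invented ratings for the first indicator (regular activity).
regular_activity = [
    RuleRating("know work and performance criteria", True, ["interview A"]),
    RuleRating("skills, resources, time, information available", False),
    RuleRating("creative when resources are lacking", True),
    RuleRating("creative when procedures no longer apply", True),
]
summary = indicator_summary(regular_activity)
print(summary["share_satisfied"])  # prints 0.75
print(summary["to_correct"])
```

Keeping the evidence next to each rating makes it easier to trace a score back to the interviews and observations that justify it.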

This conceptual background supports the application of the framework. The following section describes the different phases to follow for conducting the assessment.

7 Key Elements of the Framework

Applying the framework consists of conducting workshops, individual interviews, focus groups, and observations to collect, analyze, and present data on the performance of resilience, on factors to be preserved and corrected, and on action plans for developing resilience. The framework is modular (Modules 1–4); the complete process is needed only for the first implementation in an organization. The four modules are:

  1. Definition of the context of the study. The first step involves clearly defining the goal and scope of the study. The team organizes workshops for describing the system studied, the diversity of events it has to respond to, and its capacity to respond. An assessment methodology and associated supportive material (diagnostic schedule, interview and observation guidelines, assessment grid) are derived from this context.

  2. Data collection. The second step aims at collecting data related to the system’s capacity to respond. The team conducts individual and collective interviews and observations to collect qualitative and quantitative data about the system’s structure and dynamics in regular times and when disturbances occur, considering the different actors of the system (operational agents, managers, and directors).

  3. Diagnosis. The third step consists of analyzing the data collected to provide a resilience score and a list of factors to be preserved and corrected.

  4. Definition of an action plan. The fourth step consists of providing a set of actions to develop resilience by correcting negative factors and highlighting and preserving positive factors.

8 Roles and Responsibilities

A set of essential roles supports the distribution of responsibilities when applying the method, considering that one person can assume different roles.

  • The “evaluation owner” is the person who is mainly responsible for the system to be assessed. This critical role encompasses the following responsibilities: defining the goal and scope of the evaluation process, and supporting the assessment team by providing access to the agents of the system, to documents, and to the resources needed by the assessment (room, material, etc.).

  • The “evaluation coordinator” is the person who is mainly responsible for the evaluation process. The evaluation coordinator should cover the following responsibilities: defining the target, the scope, and the objective of the evaluation process with the “evaluation owner,” planning the different steps of the assessment, monitoring the realization of the different steps, and managing issues when performing the different steps.

  • The “stakeholder coordinator” is the person who is mainly responsible for the coordination with the various agents involved in the assessment. The stakeholder coordinator should cover the following responsibilities: identifying the agents, inviting the agents to workshops, and providing feedback on the assessment to the agents.

  • The “technical coordinator” is the person who is mainly responsible for the realization of the assessment task. The technical coordinator should cover the following responsibilities: organizing and animating workshops, and writing deliverables.
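Because one person can assume several roles, it can be useful to check that every role is covered before the assessment starts. The following is a minimal sketch under that reading; the function name `unassigned_roles` and the placeholder person names are hypothetical.

```python
from typing import Dict, Set

# The four roles named in this section.
REQUIRED_ROLES = {
    "evaluation owner",
    "evaluation coordinator",
    "stakeholder coordinator",
    "technical coordinator",
}


def unassigned_roles(assignments: Dict[str, Set[str]]) -> Set[str]:
    """Return the roles that nobody assumes.

    `assignments` maps a person to the set of roles that person assumes;
    a person may hold several roles.
    """
    covered: Set[str] = set().union(*assignments.values())
    return REQUIRED_ROLES - covered


# Placeholder names; one person holds two roles.
team = {
    "person_a": {"evaluation owner"},
    "person_b": {"evaluation coordinator", "stakeholder coordinator"},
}
print(unassigned_roles(team))  # prints {'technical coordinator'}
```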

The following sections describe the rationale of the four modules: the objective of each phase and practical information for conducting the associated workshops.

8.1 Data Sought and Reason(s) for Choosing

The first step aims at defining the context of the assessment process. It involves: (1) defining the goal and scope of the study, (2) describing the resilience of the system assessed, (3) organizing the assessment. The team in charge of the assessment organizes workshops for achieving these tasks. The following section presents the different workshops.

Defining the goal and scope of the study: The first task consists in defining the general context of the assessment process. The “evaluation owner” and the “evaluation coordinator” define the goal and the scope of the study (cf. Table 1).

Table 1 Goal and scope definition task

Describing the system resilience: The second task consists in describing the system to be studied and the performance of its capacity to respond. The “evaluation coordinator,” assisted by the “technical coordinator,” describes the system and defines the performance of its capacity to respond by considering the context of the study (cf. Table 2).

Table 2 System resilience description task

Organizing the assessment: The third task consists in defining the assessment project by planning the assessment, assigning roles and responsibilities, and designing material (cf. Table 3).

Table 3 Organizing the assessment task

Once these three tasks are completed, all the elements needed to conduct the data collection and the assessment are available.

8.2 Data Collection

The second step aims at collecting the data required to proceed with the performance assessment. It involves: (1) presenting the assessment context, aim and methodology and (2) collecting data.

  • Present the assessment context, aim, and methodology: The first task consists of explaining to the stakeholders the context, the objective, and the organization of the assessment (cf. Table 4).

Table 4 Present the assessment context, aim, and methodology task

  • Collect data: The data collection process aims at collecting the information required to proceed with the performance assessment (cf. Table 5).

Table 5 Collect data task

Data analyses: The third step consists in analyzing the data collected to provide a resilience score and a list of factors to be preserved and to be corrected. It involves (1) evaluation of the indicators, (2) formalization of factors of resilience and vulnerability, and (3) writing of the preliminary report.

  • Evaluation of the indicators: Once the assessment team has completed the data collection phase, it proceeds to the analysis of the data. This process consists of rating the two indicators with the support of the data collected (Table 6).

Table 6 Assess indicators task

  • Formalization of factors of resilience and vulnerability: The assessment team also formalizes two lists of factors. The first list is labelled “resilience factor”; the second is labelled “vulnerability factor” (Table 7).

Table 7 Formalization of factors of resilience and vulnerability task

8.3 Writing and Validation of the Resulting Report

The third task of this phase consists in writing a preliminary version of the assessment report and validating it with stakeholders (Table 8).

Table 8 Report writing and validation task

8.4 Recommendations and Action Plans

The fourth step consists of providing a set of actions to develop the performance of resilience by correcting negative factors and highlighting and preserving positive factors. It involves (1) hierarchization of the resilience and vulnerability factors and (2) identification of actions.

  • Hierarchization of the resilience and vulnerability factors: The first task of this phase consists in ranking resilience and vulnerability factors (Table 9).

Table 9 Resilience and vulnerability factors task

  • Identification of actions: The second task consists in defining short- and long-term actions for preserving resilience factors and correcting vulnerability factors (Table 10).

Table 10 Action’s identification task
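The ranking task can be illustrated with a small sketch. The chapter does not state the ranking criterion here, so the sketch assumes that workshop participants each give a priority score to every factor and that factors are ordered by mean score; the function `rank_factors`, the 1 to 5 scale, and the scores are all hypothetical.

```python
from statistics import mean
from typing import Dict, List, Tuple


def rank_factors(scores: Dict[str, List[int]],
                 top_n: int = 5) -> List[Tuple[str, float]]:
    """Order factors by mean priority score (highest first) and keep the top ones."""
    ranked = sorted(
        ((name, mean(values)) for name, values in scores.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:top_n]


# Invented scores (1 = low priority, 5 = high priority), one per participant.
vulnerability_scores = {
    "unavailability of resources": [5, 4, 5],
    "procedures difficult to apply": [3, 4, 3],
    "delays in finalizing the process": [2, 2, 3],
}
for name, score in rank_factors(vulnerability_scores, top_n=2):
    print(f"{name}: {score:.2f}")
```

The same function can be applied separately to the list of resilience factors, so that actions are then defined for the highest-ranked entries of each list.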

Finally, the assessment team provides the final report presenting the system studied, the methodology followed, the results of the assessment, and the action plan defined.

9 Lessons from the Application of the Framework

This section presents the lessons learned from applying the framework in the railway industry. The evaluation owner was responsible for the station’s train traffic. The evaluation coordinator was a railway expert in human factors. The stakeholder coordinator was the manager of train departure/arrival processes, and the technical coordinator was an expert in resilience engineering. This team collaborated to accomplish the four phases of the study.

  • Definition of the context of the study: The motivation of the study was to experiment with a resilience engineering-based assessment to identify its added value with respect to human factors assessment. The system studied was the train departure and arrival processes. These tasks involve operational agents, first-line managers, and safety managers. Schedules, procedures, and time constraints structure the process. Injuries may occur, and events happening in the station and in the network might affect its functioning.

After a set of preliminary interviews, the team adapted the generic performance indicators and produced a questionnaire for collecting qualitative data describing the diversity and complexity of the capacity of the train departure and arrival processes to respond to unwanted situations.

  • Data collection: Using a specific questionnaire, the technical coordinator interviewed eight operational agents, six team leaders representing the different tasks of the departure/arrival process, the head of the safety management department, and the head of traffic management in the station. He spent one day observing the realization of the different tasks and, together with a human factors expert, attended a crisis management exercise.

  • Diagnosis: The first indicator provides insight into operational agents’ adaptive capacity and the margins of maneuver provided by the system for overcoming the variability of routine situations. Operational agents demonstrate good knowledge of the complexity of their tasks and find trade-offs between the different performance dimensions. They consider beginners’ training and fear management, procedure modifications, task risk perception, and technological failures as sources of disturbances. They acknowledge having sufficient temporal margins of maneuver but have to compensate for the unavailability of human and technical resources with increased communication and coordination. Communication is an essential dimension of performance: the agents’ objective is to deliver the right message to the right person at the right time. They use informal communication networks and personal information tools to complement the formal communication system. Many situations, incidents, delays, and malfunctions require an adaptive response. Managers are able to compensate for the absence of operational agents in the performance of their tasks. Agents take the initiative to perform tasks. The hierarchy provides temporal margins, even if this creates some delays in the finalization of the process. Nevertheless, it requires operational agents to follow procedures.

The second indicator addresses operational agents’ adaptive capacity and the margins of maneuver provided by the system for overcoming abnormal situations such as incidents or accidents. Agents distinguished four types of abnormal situations: increased workload, safety incident in the station, crisis managed by the station, and crisis managed by an authority external to the station. These situations induce an increased number of tasks to achieve, verbal and physical aggression, stress, unavailability of resources, difficulty or impossibility in applying procedures, or leadership and authority constraints. A culture of mutual assistance between agents and the “pride of the railwayman” contribute to the adaptation efforts that agents make to overcome disturbances.

  • Definition of an action plan: The technical coordinator presented the assessment results to the agents interviewed and to representatives of the train arrival/departure processes. For the two indicators, the vulnerability and resilience factors identified were presented, illustrated, and discussed. The five essential resilience and vulnerability factors were selected. For each of them, a brainstorming session was conducted to identify short- and long-term changes to prevent vulnerability factors and preserve resilience factors. Solutions emerged; nevertheless, the absence of an available budget made their application difficult. One feasible solution identified was the integration of resilience engineering issues within human factors training that was already planned.

10 Conclusion

This chapter presents a methodological framework dedicated to assessing the capacity to respond to the diversity of situations that may affect a sociotechnical system. This framework uses traditional qualitative data collection methods and the theoretical background of resilience engineering.

The application of the method allows the identification of a set of lessons:

  1. Agents are willing to discuss how they adapt when disturbances happen.

  2. Talking about actions at the limit of or outside the procedural context is complex with the hierarchy.

  3. Budget optimization policies make it difficult to carry out changes aimed at resolving vulnerability factors and preserving resilience factors.

The next steps in developing and validating the framework consist of applying it to another sociotechnical system and adapting it to consider cities’ resilience.