1 Introduction

Safety Management Systems (SMS) are a systematic approach to managing safety, including the necessary organizational structures, accountabilities, policies, and procedures [1]. Current SMS normally incorporate a Safety-I perspective, which defines safety as the absence of things that go wrong (e.g. occurrences, incidents and accidents). From this perspective, such adverse events are caused by technical, human and/or organizational failure or malfunction [2, 3]. To avoid adverse events in the future, their causes have to be identified and appropriate preventive measures have to be taken. Biased by hindsight, the respective analyses often reveal human behavior that deviates from the prescribed one-best-way of carrying out an activity. Such deviations are seen as major contributions to the adverse events and hence as a risk. Thus, Safety-I views the human as a risk factor that potentially harms the system through the variability of his behavior. To mitigate this risk, variability in human performance is restricted by means of standardization, regulation, automation and the like. From this perspective, reducing the human scope of action is required to guarantee stable results, i.e. to successfully achieve the intended outcome of the activity. However, the operating conditions under which an activity normally has to be performed are dynamic and constantly changing [2, 3]. As a consequence, reducing the scope of action restricts people's ability to adapt to the dynamics of operating conditions and therefore their chance of delivering stable results. Indeed, it is the need to adapt to dynamic operating conditions in order to achieve an intended objective that creates variability in the way humans carry out activities [4]. Against this background, unsuitable restrictions of the human scope of action not only fail to produce more stable results, they even hamper them. Therefore, standardization and the like may reduce system safety.

For this reason, the Safety-I based SMS approach needs to be critically examined with regard to the appropriateness of the standardizations and restrictions of human scope of action it produces. To avoid unsuitable restrictions, the Safety-I approach needs to be complemented by the Safety-II perspective. In contrast to Safety-I, Safety-II defines safety not as the absence of things that go wrong but rather as the presence of things that go right [2, 3]. It recognizes performance adjustments, and hence performance variability, as the basis for successful adaptation to dynamic operating conditions and therefore for successful performance. Since the human has the ability to adjust behavior to situational dynamics, Safety-II considers the human not only a risk factor but also a safety factor and a main source of system resilience and safety. The precondition for deploying this ability is scope of action.

2 Objectives and Proceeding of the Project

The Safety-II approach is very well founded in theory. However, it is much less clear how to implement the Safety-II perspective in existing SMS. It was therefore the objective of this project to develop a safety tool based on Safety-II assumptions that can be integrated into existing Safety-I based SMS. Safety-II aims not at substituting Safety-I but at supporting and complementing it. Correspondingly, a Safety-II based tool does not aim at contradicting Safety-I based means of assuring safety but at improving them.

In order to specify the focus of the Safety-II based tool as well as the framework conditions of its application, nine semi-standardized qualitative interviews with safety experts from aviation were conducted. By means of a qualitative content analysis, specific objectives regarding what the tool should be able to achieve were identified. These objectives are described in the next section. Furthermore, the analysis also revealed how the tool should achieve its objectives, i.e. which framework conditions it should respect. As a result, the envisioned tool is easy for users to understand, produces clear results and high acceptance among users, does not require excessive resources and can be easily integrated into everyday work.

Based on the identified objectives, indicators were developed in order to evaluate the tool. Pilot applications of the tool in different organizations showed that these indicators were met. Hence, the Safety-II based tool as developed in this project is suitable for achieving the objectives set by the safety experts. It is described in the following sections.

3 The Measure Evaluation Tool (MET)

The tool’s purpose is to complement current SMS by incorporating a Safety-II based approach into them. However, foci need to be set as the resources for applying the tool are limited. One possibility is to use the tool for evaluating measures generated by the traditional, Safety-I based approach. Such measures, as outlined above, are typically based on analyses of adverse events, identify deviations from the one-best-way as root causes and aim at protecting the one-best-way by means of standard operating procedures (SOPs) and the like. At this point, the tool can be used to analyze whether or not the Safety-I based measures create new risks as a side effect by depriving the human performer of the scope of action required for the necessary performance adjustments in everyday operations. With this focus, the tool can be integrated into traditional, Safety-I based SMS, thereby supporting the improvement of safety measures from the Safety-I and the Safety-II perspective combined. Against this background, the tool has been named MET, i.e. Measure Evaluation Tool.

3.1 Objectives of the MET

The MET supports the identification of risks and side effects of safety measures that reduce the human scope of action. To reach this aim, the following objectives are set for the MET:

  (1) Describing the work as it really is: The MET is not based on the assumption of idealized operating conditions, as is usually the case for Safety-I based safety measures. It identifies and describes the concrete operating conditions under which the human normally has to perform work. It therefore describes work as it really is, i.e. work as done.

  (2) Identification of prioritizations: The MET recognizes that operating conditions are dynamic. It therefore recognizes that the human has to constantly adjust his performance to the actual operating conditions. To do so, he puts more or less effort into an activity depending on situational circumstances and thereby prioritizes certain activities over others.

  (3) Identification of the necessity of prioritizations: By identifying actual operating conditions, the MET is able to discern «why» and «how» prioritizations are made. Whereas the former describes the reason for performance adjustments, the latter identifies concrete decision-making criteria for prioritizing under everyday operating conditions.

  (4) Identification of strengths and weaknesses of prioritizations: Based on the identified operating conditions, the MET makes it possible to understand, on the one hand, what the strengths of prioritizations are. Every appropriate way in which prioritizations are made in order to adapt to the operating conditions present in a specific situation and to achieve a specific objective is considered a strength. On the other hand, the MET also makes it possible to understand what the weaknesses of prioritizations are. Every inappropriate way in which prioritizations are made is considered a weakness.

  (5) Identification of risks and side effects of Safety-I based safety measures: Based on the strengths and weaknesses of prioritizations, risks and side effects of Safety-I based safety measures are identified. These emerge from a reduced scope of action that hampers appropriate prioritizations.

  (6) Support for the development/improvement of safety measures: The identification of risks and side effects makes it possible to improve safety measures.

  (7) Integrating the tool in current, Safety-I based SMS: Finally, it is an important objective of the project to develop a practicable tool that can be integrated into existing Safety-I based SMS, complementing them with the Safety-II perspective.

3.2 Conceptual Background of the MET

The conceptual background of the MET is depicted in Fig. 1. It mainly refers to the above-mentioned necessity of performance adjustments, i.e. to the need to adapt the way of carrying out a certain activity to concrete and dynamic operating conditions. Whenever the human performs an activity, he has to balance efficiency and thoroughness (cf. Efficiency-Thoroughness Trade-Off (ETTO); [5]). Performing an activity thoroughly means doing it with perfect accuracy, without negligence or omissions. However, perfect thoroughness is not achievable as there is always the possibility of performing an activity even more thoroughly. Perfect thoroughness would therefore require, at least theoretically, infinite time. As time is always limited, the activity has to be completed in due time and hence the human performer needs to decide on an appropriate level of thoroughness. By this decision, be it taken consciously or unconsciously, efficiency and thoroughness in carrying out the activity are balanced. Figure 1 depicts this continuum of efficiency and thoroughness on the vertical axis (ordinate).

Fig. 1. Conceptual background of the MET

Safety-I based regulations often define a one-best-way for performing an activity (e.g. by SOPs). In doing so, they assume ideal operating conditions for activity performance. Deviating from the prescribed one-best-way is hence considered a violation of the regulations. The evaluation scheme is bimodal, i.e. the one-best-way is considered right and any deviation from it is considered wrong. Figure 1 depicts the one-best-way as Safety-I Optimum. However, operating conditions are normally dynamic and not ideal (e.g. lack of time as many different activities need to be performed within a limited time slot). As a consequence, the human is forced to adjust his performance to the current situation by balancing efficiency and thoroughness in a way that meets the actual requirements of a concrete situation. This creates variability in human performance, depicted in Fig. 1 as a winding line on the continuum of efficiency and thoroughness. In contrast to the viewpoint of Safety-I, Safety-II considers this performance variability not a deviation, but a necessary adjustment to normal operating conditions. From this point of view, performance variability is not a violation. On the contrary, it is the main human contribution to successful performance as it allows for resilient coping with non-ideal, i.e. limited and dynamic, operating conditions.

Moreover, carrying out an activity perfectly thoroughly is not necessarily safe. In emergencies, for example, safety may require quick rather than thorough action. Otherwise, the patient dies before the emergency physician comes up with a thorough diagnosis. In normal operations, too, activities need to be completed in due time in order to be safe. Endless analysis, for example regarding the appropriateness of a measure, may prevent any measure from being taken at all, which in turn does not make the system safer. Consequently, the Safety-I Optimum as depicted in Fig. 1 also represents a Safety Boundary related to Efficiency. Going above this limit endangers safety due to too little efficiency, although this might be rare in everyday work.

On the other hand, a total prioritization of efficiency over thoroughness would, at least theoretically, mean delivering zero thoroughness. Consequently, there is a lower limit for ETTO, depicted as Safety Boundary related to Thoroughness in Fig. 1. This limit represents a safety boundary to performance variability. Going below it would endanger safety from both perspectives, Safety-I and Safety-II.

However, unlike Safety-I, Safety-II takes the position that not only the one-best-way of performing an activity is safe but the entire range between the Safety Boundary related to Efficiency as upper limit and the Safety Boundary related to Thoroughness as lower limit. This range is labeled Safety-II Optimum in Fig. 1 as it balances efficiency and thoroughness optimally for a specific activity in a specific situation. This also takes into consideration other activities that have to be performed in a specific situation, as the time for performing a specific activity may be limited by the necessity to perform other activities as well. Consequently, the need for ETTO within an activity does not necessarily depend on the activity itself only, but on the total workload.

Of course, the minimum thoroughness, i.e. the Safety Boundary related to Thoroughness, is not known. However, crucial for the MET concept is that variability in human performance within the optimal range is not considered a performance deviation but normal performance that is both necessary (as the adjustment is required by the situation) and adequate (as the adjustment enhances safety). Therefore, it is also considered safe, although it deviates from the one-best-way defined by Safety-I based regulations. Moreover, and in accordance with the Safety-II assumptions, variability is considered simply normal in everyday work.
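The contrast between the bimodal Safety-I evaluation and the Safety-II range can be illustrated with a minimal sketch. It assumes, purely for illustration, that thoroughness can be placed on a numeric scale from 0 to 1 and that both boundaries are known; as noted above, the Safety Boundary related to Thoroughness is in fact not known, so the class `SafetyBoundaries`, the functions and all numbers below are hypothetical and not part of the MET.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafetyBoundaries:
    """Hypothetical numeric stand-ins for the two boundaries in Fig. 1."""
    thoroughness_boundary: float  # lower limit: Safety Boundary related to Thoroughness
    efficiency_boundary: float    # upper limit: Safety Boundary related to Efficiency


def safety_i_verdict(level: float, one_best_way: float, tol: float = 1e-9) -> str:
    """Bimodal evaluation: only the prescribed one-best-way counts as right."""
    return "right" if abs(level - one_best_way) <= tol else "wrong"


def safety_ii_verdict(level: float, bounds: SafetyBoundaries) -> str:
    """Range evaluation: everything between the two boundaries counts as safe."""
    if level < bounds.thoroughness_boundary:
        return "unsafe: too little thoroughness"
    if level > bounds.efficiency_boundary:
        return "unsafe: too little efficiency"
    return "within Safety-II Optimum"


# Illustrative values only. The one-best-way (Safety-I Optimum) coincides
# with the upper boundary, as described for Fig. 1.
bounds = SafetyBoundaries(thoroughness_boundary=0.4, efficiency_boundary=0.9)
for level in (0.2, 0.7, 0.9):
    print(level,
          safety_i_verdict(level, one_best_way=bounds.efficiency_boundary),
          safety_ii_verdict(level, bounds))
```

In this sketch, a level between the boundaries is judged wrong by the Safety-I scheme but lies within the Safety-II Optimum, which is the difference in evaluation the section describes.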

Against this background, it is crucial for safety that prioritizations are made appropriately, i.e. in a way that adequately takes into account a situation’s specific operating conditions. To support this adequacy, the relevant decision-making criteria for balancing efficiency and thoroughness need to be understood. Systematically identifying such criteria is the main benefit of the MET. It guides the user through several steps of analysis that are suitable for identifying an activity’s everyday operating conditions. It distinguishes between primary operating conditions, which cause the need for performance adjustments (e.g. limited resources), and secondary operating conditions, which are decisive when the performance is actually adjusted, i.e. when a concrete balance of efficiency and thoroughness is established (e.g. the activity’s risk). Both types of operating conditions need to be considered adequately when deciding on performance adjustments. Hence, they are the respective decision-making criteria. The MET refers to the FRAM method [6] for conceptualizing different kinds of operating conditions (cf. Sect. 3.3).

The steps of the MET are described in the following section.

3.3 Steps of the MET

The eight steps of the MET are described in the following:

  (1) In a first step, the core activity regulated by the safety measure to be evaluated is identified. The core activity is the activity at the focus of the analysis when applying the MET. To identify the core activity, the activity in which the prioritization (from a Safety-I perspective considered a deviation and hence a violation) took place has to be recognized. By analyzing prioritizations performed in order to adapt to varying operating conditions, it can be understood (in the further steps of the MET) what operating conditions are like in reality, and not as imagined, as is often the case when regulations are developed. On the basis of this first step, it is subsequently possible to identify the work as done (objective 1) and the prioritizations that the human performer normally has to make (objective 2).

  (2) In the second step, the concept of operating conditions is introduced. Based on the FRAM method [6], the MET considers inputs, preconditions, time, controls and resources as the central operating conditions of activities that force the human to adjust his performance in order to achieve specific set objectives, i.e. the outputs of an activity. These outputs are defined as that which is the result of an activity. In the MET, the outputs are not considered operating conditions as they do not influence how activities are performed. Instead, they result from it. The inputs are that which starts the activity and/or is used or transformed to produce the outputs. Preconditions are conditions that must be fulfilled before an activity can be carried out. Time corresponds to temporal aspects that affect how an activity is performed. Controls correspond to that which supervises or regulates an activity (e.g. plans, procedures, guidelines or other activities). Resources are that which is needed or consumed for the performance of an activity (e.g. matter, energy, competence, software, manpower). All these definitions correspond to the respective definitions used by Hollnagel in the FRAM method [6].

  (3) In a third step, the outputs are identified in order to differentiate between wanted and unwanted outcomes of the core activity.

  (4) In the fourth step, the inputs are identified (first part of objective 1). In addition, variabilities that can arise in the inputs are identified in this step. As mentioned, operating conditions are never exactly the same over time or across situations. In this step it should therefore be described what the inputs of the core activity usually look like and how they typically vary in different situations at different times.

  (5) In a fifth step, the remaining operating conditions and the variabilities that typically arise in them are identified (second part of objective 1). In this step, it is further specified for each operating condition whether it is a primary or a secondary operating condition, whereby some can be both. Primary operating conditions cause the necessity for performance adjustments. They determine the reason «why» there must be a performance adjustment. Through their identification, the necessity of prioritizations can therefore be acknowledged (objectives 2 and 3). Scarcity of time is an example of a primary operating condition because, in order to achieve an output of the core activity in a given timeframe, this activity must be prioritized over other activities, leading to a performance adjustment. Often, primary operating conditions are not activity specific but instead result from the set of activities to be performed in a specific situation (e.g. too many activities must be performed in too little time). Secondary operating conditions provide decision-making criteria for a performance adjustment. Hence, they determine the «how» of the prioritization. Riskiness of an activity (e.g. resulting from lacking resources) is an example of a secondary operating condition. An activity with high riskiness needs to be performed more thoroughly than one with low riskiness. When there is a need to prioritize between two activities because a primary operating condition is in place, the secondary operating condition determines which of the two activities is prioritized over the other. Secondary operating conditions are activity specific because they are based on the characteristics of a specific activity.

  (6) In the sixth step, the importance of the operating conditions (i.e. how much they influence the decision of prioritization) is determined. The more important operating conditions are used as a basis for suggestions aiming at improving the safety measure in the eighth and final step.

  (7) The consequences of the prioritization are identified in the seventh step. Through this step, strengths and weaknesses of the prioritization (objective 4) and possible side effects of safety measures (objective 5) are assessed. As mentioned, every appropriate way in which prioritizations are made should be considered a strength, every inappropriate way a weakness. Only ways that enhance the probability of achieving wanted outcomes or of avoiding unwanted ones are appropriate and should therefore be supported by safety measures. Hindering the possibility of making appropriate prioritizations is considered a side effect of safety measures.

  (8) In the eighth step, suggestions for developing/improving the safety measure are formulated (objective 6). From a Safety-I perspective, the focus is first on the most important primary operating conditions (identified in the fifth and sixth steps) that should be controlled. This means that the necessity of prioritizations caused by these operating conditions should be reduced. As a consequence, the safety measure becomes more appropriate for the operating conditions that are really present in typical situations in which the human performs the activity. From a Safety-II perspective, it is then discussed which prioritization decisions should be supported and which should instead be hindered. This discussion is based on the secondary operating conditions identified in the fifth step. Only prioritizations that are identified as appropriate should be supported.
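The concepts used in the eight steps can be sketched as a small data model. This is only one possible way of recording a MET analysis, not a format prescribed by the tool: the class names (`Aspect`, `Role`, `OperatingCondition`, `MetAnalysis`), the 1 to 5 importance scale and the sample entries are assumptions made for illustration. The five aspects follow the FRAM-based operating conditions of step 2, the roles follow the primary/secondary distinction of step 5, and the ranking corresponds to steps 6 and 8.

```python
from dataclasses import dataclass, field
from enum import Enum


class Aspect(Enum):
    """The five FRAM-based operating conditions used by the MET (step 2)."""
    INPUT = "input"
    PRECONDITION = "precondition"
    TIME = "time"
    CONTROL = "control"
    RESOURCE = "resource"


class Role(Enum):
    PRIMARY = "primary"      # causes the need for an adjustment ("why")
    SECONDARY = "secondary"  # decision criterion for the adjustment ("how")


@dataclass
class OperatingCondition:
    name: str
    aspect: Aspect
    roles: set[Role]         # a condition can be primary, secondary, or both
    importance: int          # hypothetical scale: 1 (low) to 5 (high), step 6
    typical_variability: str = ""


@dataclass
class MetAnalysis:
    core_activity: str                   # step 1
    wanted_outcomes: list[str]           # step 3
    unwanted_outcomes: list[str]         # step 3
    conditions: list[OperatingCondition] = field(default_factory=list)  # steps 4-6

    def ranked(self, role: Role) -> list[OperatingCondition]:
        """Conditions with the given role, most important first (input to step 8)."""
        matching = [c for c in self.conditions if role in c.roles]
        return sorted(matching, key=lambda c: c.importance, reverse=True)


# Sample entries based on the generic examples given in step 5.
analysis = MetAnalysis(
    core_activity="(core activity under analysis)",
    wanted_outcomes=["(wanted outcome)"],
    unwanted_outcomes=["(unwanted outcome)"],
    conditions=[
        OperatingCondition("scarcity of time", Aspect.TIME, {Role.PRIMARY}, 5),
        # Assumed aspect: the text only says riskiness may result from lacking resources.
        OperatingCondition("riskiness of the activity", Aspect.RESOURCE, {Role.SECONDARY}, 4),
    ],
)
print([c.name for c in analysis.ranked(Role.PRIMARY)])
print([c.name for c in analysis.ranked(Role.SECONDARY)])
```

Keeping `roles` as a set reflects the statement in step 5 that some operating conditions can be both primary and secondary.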

3.4 Example of a Concrete Case

In this section, a fictional case is described in order to illustrate the theoretical background of the project and how the MET can be applied:

In an acute care hospital, the administration of medicines to patients is regulated by a safety measure which requires that, before a nurse administers medicines to a patient, another nurse must have checked them. However, the measure does not consider that the hospital lacks nurses and that, because of the large number of patients, nurses typically do not have enough time to check the medicines of every patient before they are administered. This means that in practice it is actually impossible to follow the safety measure as prescribed. Theoretically, it might be possible, but only at the cost of performing other safety-critical activities less thoroughly. In order to overcome this problem, nurses have always checked more thoroughly the medicines to be administered to patients they consider «risk cases», i.e. patients for whom a mistake in the administration of medicines would have severe consequences for their health. The medicines to be administered to patients not considered «risk cases» were only checked thoroughly (or checked at all) if there was enough time. In this way, dangerous administrations of medicines could be avoided or at least reduced as much as possible. However, after a patient died because the wrong medicines had been administered, the safety measure was tightened. The check of the medicines by the nurses must now be documented by signing a document of approval. Since the measure cannot be followed because it does not take into account the typical operating conditions under which the hospital’s personnel actually have to work, nurses are forced to continue deviating from the safety measure. As a consequence, nurses often sign the documents at the end of their shift without necessarily having checked the medicines administered to patients not considered «risk cases».

This example describes the typical approach of Safety-I: If an occurrence takes place, its cause is typically found in human behavior that deviates from the right way of performing an activity (or the one way that is considered right, i.e. the one-best-way specified in the safety measure). Thus, to mitigate this risk, variability in human performance (i.e. the possibility of deviating from the safety measure) is further restricted. In the following, the application of the MET’s steps is described with reference to the example.

  (1) In the first step, the core activity is identified. In the present case, the core activity is «checking the medicines» because in this activity a prioritization (from the Safety-I perspective considered a violation) took place.

  (2) In the second step, the concept of operating conditions is introduced. This step can be carried out independently of the specific case analyzed.

  (3) In the third step, the outputs of the core activity are identified. In this case, the main output corresponds to «checked correctness of medicine». Based on the identification of the outputs, wanted outcomes (e.g. incorrect medicine has been identified) can be differentiated from unwanted outcomes (e.g. incorrect medicine has not been identified).

  (4) Further, in the fourth step the inputs and the variabilities that typically arise in them are identified. In this case, the main input corresponds to receiving the medicines that must be checked. Typical variabilities in the input are, for example, that medicines sometimes arrive too late or that they are incorrect or incomplete.

  (5) In the fifth step, the other operating conditions under which the hospital’s personnel actually have to work (i.e. preconditions, time, controls and resources) and their typical variations are identified. In this way, the MET makes it possible to identify, among other factors, that there is normally not enough time to perform the core activity because there are not enough resources. Thanks to the identification of the operating conditions and their typical variations, it is furthermore acknowledged why and how the hospital’s personnel have to prioritize. In doing so, primary and secondary operating conditions are distinguished. In this case, the lack of time and the lack of resources are identified as primary operating conditions. In fact, these force the hospital’s personnel to prioritize certain checks of medicines over others. In the presented case, nurses decide which medicines to check more thoroughly, i.e. which checks are prioritized, depending on the perceived severity of the patients’ condition. This operating condition describes how, i.e. based on which criteria, the hospital’s personnel decide which medicines to prioritize for checking. It is therefore a decision-making criterion for a performance adjustment, and hence a secondary operating condition.

  (6) In the sixth step, the importance of the identified operating conditions is assessed. In this case, lack of time and resources and the perceived severity of the patients’ condition are assessed as important operating conditions because they strongly influence the performance of the core activity.

  (7) The consequences of the prioritization are identified in the seventh step. In this case, the consequence is that the medicines are checked more thoroughly for the most critical patients. Therefore, the probability that these patients get the correct medicines is enhanced and the probability that they get incorrect medicines is reduced, i.e. the achievement of the wanted outcomes identified in the third step and the prevention of unwanted outcomes are improved. This means that a measure which removes the possibility of checking medicines for non-critical patients less thoroughly, in order to free resources for checking medicines for critical patients more thoroughly, hinders a crucial strength of the prioritization. This is therefore considered an unwanted side effect of the safety measure.

  (8) Based on the most important primary operating conditions identified in the fifth and sixth steps, the safety measure could be improved in the eighth step from a Safety-I perspective by reducing the necessity to prioritize as much as possible. This could be achieved, for example, by hiring more health personnel or providing more time to check the medicines. However, as workload in hospitals is normally not equally distributed, there will always be workload peaks causing time pressure. Therefore, the safety measure should also be improved from a Safety-II perspective by formulating criteria for deciding how to prioritize accurately when checking medicines. To do so, reliable criteria are required. The identification of the most important secondary operating conditions in the fifth and sixth steps makes it possible to identify such criteria in the eighth step. In this specific case, it could be discussed whether the perceived severity of the patients’ condition can be considered an appropriate criterion for deciding how to prioritize. If so, safety measures have to make sure that the perception of the severity of the patients’ condition is reliable. It must be avoided that health personnel consider «risk cases» non-risky. Accurate perception of the patients’ condition ensures that performance adjustments are made in a way that does not endanger safety.
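The prioritization described in the fictional case can be illustrated with a short sketch: limited time (primary operating condition) forces a choice, and the perceived severity of the patient's condition (secondary operating condition) decides which checks come first. The names `MedicineCheck` and `plan_checks`, the durations and the simple greedy allocation are assumptions for illustration only; the case itself does not specify how nurses allocate their time.

```python
from dataclasses import dataclass


@dataclass
class MedicineCheck:
    patient: str
    risk_case: bool       # perceived severity: the secondary operating condition
    minutes_needed: int   # hypothetical duration of a thorough check


def plan_checks(checks, minutes_available):
    """Split checks into (thorough, deferred) under a time budget.

    Risk cases are served first; all other checks are done thoroughly only
    if time remains. This is a simple greedy allocation, not an optimal one.
    """
    # sorted() is stable, so the original order is kept within each group.
    ordered = sorted(checks, key=lambda c: not c.risk_case)
    thorough, deferred = [], []
    remaining = minutes_available
    for check in ordered:
        if check.minutes_needed <= remaining:
            thorough.append(check.patient)
            remaining -= check.minutes_needed
        else:
            deferred.append(check.patient)
    return thorough, deferred


# Illustrative shift: four checks of 5 minutes each, but only 12 minutes available.
shift = [
    MedicineCheck("A", risk_case=False, minutes_needed=5),
    MedicineCheck("B", risk_case=True, minutes_needed=5),
    MedicineCheck("C", risk_case=False, minutes_needed=5),
    MedicineCheck("D", risk_case=True, minutes_needed=5),
]
thorough, deferred = plan_checks(shift, minutes_available=12)
print("thorough:", thorough)
print("deferred:", deferred)
```

The sketch also shows why the reliability of the criterion matters: if a «risk case» is wrongly recorded with `risk_case=False`, it is moved to the back of the queue and may not be checked thoroughly at all.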

4 Discussion

In the complexity of today’s world, the safety approach of traditional SMS reaches its limits. Traditional SMS often introduce regulations that describe the one-best-way of performance for every activity and then require the human to follow it perfectly thoroughly. Through this process, SMS aim at standardizing performance, assuming that always carrying out activities in exactly the same, one-best-way is suitable for avoiding adverse events and hence ensures safety. Every deviation from this one-best-way is therefore considered a violation. Consequently, because of the variability of his performance when carrying out activities, the human is seen as a risk factor. This perspective is known as Safety-I. A more comprehensive perspective, known as Safety-II, points out that the operating conditions under which activities must be performed are never ideal or perfectly stable. It is therefore not possible to set a one-best-way for every activity in every situation. In order to adapt to the non-ideal, varying operating conditions, a certain amount of variability is needed. Variability in human performance, i.e. performance adjustments, is in fact the result of adaptation to the operating conditions present in a specific situation and should therefore be considered not only a risk factor but also a crucial safety factor. Reducing human scope of action when performance adjustments are actually needed can hinder safety and should therefore be considered an unwanted side effect of safety measures [3, 4].

The Measure Evaluation Tool (MET), developed in this project in close collaboration with safety experts from aviation, is based on the Safety-II assumptions. Thanks to the MET, actual operating conditions and their typical variations can be identified. This makes it possible to understand why and how performance adjustments are needed. The results of the MET therefore make it possible to improve Safety-I based measures by reducing their side effects.

Strengths of the MET: A major strength of the MET is that it makes it possible to improve SMS with a new perspective that is appropriate for the complexity of today’s reality. Furthermore, the MET is designed to be integrated into traditional SMS and can therefore be considered a practicable tool that is able to put into practice the theoretical assumptions on which it is based.

Weakness of the MET: To fully implement the Safety-II assumptions, an organization’s safety culture needs to develop. Among other things, deviations from regulations should not be considered violations a priori, and the need for variability and performance adjustments should be recognized. The MET alone can of course not change the safety culture of an organization. This can be considered a weakness of the tool. However, thanks to the MET, a first important step in the right direction can be made.

Practical implications: It is strongly recommended to integrate the Safety-II perspective into existing SMS. The MET represents a concrete way of doing so, as it allows inappropriate, excessive standardization of work to be recognized and avoided. Considering the complexity of today’s organizations, optimizing human scope of action is the only way to make systems safer, hindering human weaknesses while supporting human strengths. Implementing the MET in aviation organizations would therefore improve their safety.