1 Introduction

Resilience engineering (RE) has been suggested to represent an innovative approach to improving organisational health and safety management [1, 2]. However, RE is relatively new to many human factors specialists, safety practitioners and safety engineers, so its ability to deliver any improvements has been questioned [3]. In addition, compared to other strategies such as regulations, management systems, safety culture or risk management, published research on RE appears patchy and disconnected. For example, publications on the topic frequently use organisational resilience (OR) and RE interchangeably, suggesting they are one and the same. The language used in some of the published works refers to a multitude of characteristics and properties, making it a “semantically overloaded term in the sense that it means somewhat different things in different fields” [4]. Hence, trying to develop a nuanced understanding of what it is (or is not) can be a significant challenge for academics and researchers seeking to conduct research in RE, and for practitioners seeking to translate it into practice. This is a significant gap in the literature, which this paper aims to address through a review of the state of the art. The paper is organised as follows. First, the research method used to inform this review is discussed. Next, the landscape of RE is presented, followed by an analysis of published research in terms of industrial context, definitions, and dimensions/factors/measures. The paper concludes with a summary of the main gaps in the literature, proposes a working definition and identifies a series of areas for advancing research and practice.

2 Research Method

The search and selection of the literature, adapted from Hale et al. [5], included four (4) stages as illustrated in Fig. 1.

Fig. 1 Literature search and selection process

The first stage involved a systematic exploration of three electronic databases, PsycINFO, Social Science Index (SSCI) and CINAHL, through the EBSCOHost platform, using organisational resilience and resilience engineering as key search terms, for publications between January 1988 and December 2012. The initial search resulted in 4 and 98 publications. The second stage involved screening the titles and abstracts of the 103 articles to identify those that did not focus on health and safety management. The 66 articles removed at this stage focused on resilience in communities, psychology, children and youth, climate change, ecology and sustainability. At the third stage, the remaining 37 full-text articles were used as the basis for searching for further book chapters, conference proceedings and reports through Google Scholar, using the titles or first authors as keywords. An additional twenty (20) articles were added at this stage, resulting in a sample of 57 papers for selection. The fourth and final stage involved applying the inclusion criteria that publications be (a) peer-reviewed, (b) written in English, and (c) inclusive of one or more of the following: (i) definitions of the terms, (ii) measures, dimensions/factors or key concepts. A total of forty-six papers were included in the final review.
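The flow of papers through the screening, follow-up and inclusion stages can be checked with a short tally. The sketch below is illustrative only and is not part of the original review method: the `Stage` class and `tally` function are hypothetical names, the tally starts from the 103 articles screened at the second stage, and the figure of 11 papers excluded at the final stage is not stated in the text but derived here as 57 minus 46.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step of the selection process (hypothetical helper type)."""
    label: str
    delta: int  # papers added (+) or removed (-) at this step


def tally(start: int, stages: list[Stage]) -> list[tuple[str, int]]:
    """Return the number of papers remaining after each stage."""
    remaining = start
    trail = []
    for stage in stages:
        remaining += stage.delta
        if remaining < 0:
            raise ValueError(f"negative count after {stage.label!r}")
        trail.append((stage.label, remaining))
    return trail


# Counts as reported in the text, except where marked as derived.
STAGES = [
    Stage("title and abstract screening", -66),
    Stage("Google Scholar follow-up", +20),
    Stage("inclusion criteria", -11),  # derived assumption: 57 - 46
]

trail = tally(103, STAGES)
for label, remaining in trail:
    print(f"{label}: {remaining} papers remaining")
```

Under these assumptions the running totals are 37, 57 and 46, which agree with the counts reported for the second, third and fourth stages.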

The next section charts the development of RE as a field of research and practice.

3 The Landscape of Resilience Engineering

The development of any new approach for improving organisational health and safety management has generally followed major catastrophes, and RE is no exception. The Columbia space shuttle disaster has been suggested to be a good starting point [6]. On February 1, 2003, Columbia disintegrated upon re-entry into the Earth's atmosphere, killing all seven crew members [7]. RE emerged as a natural strategy to address a number of shortfalls in organisational behaviour in industries operating in cultural and organisational environments similar to that of Columbia. Following the release of the final report on the Columbia disaster, the first international symposium on RE was held in Sweden, and this acted as a catalyst for Resilience Engineering: Concepts and Precepts [8], the seminal work in this field. Since then RE has gained momentum, with a steady stream of research being published from a range of domains. It is still relatively young as a field of research and practice [9], with a lot of scope for new and emerging academics, practitioners and researchers.

However, while RE is new, the ideas and concepts that it builds on have been part of the organisational and risk management literature for over two decades. The notion of resilience was first discussed by Wildavsky [10], who extended the ideas of ecological resilience into the safety domain. It has also been associated with high reliability organisations [11, 12], and more recently with the fifth age [13] or era; hence it represents an evolution of health and safety management practice.

3.1 Industries Contributing to Research

The review identified that 37 of the 46 articles included an industrial context, and a summary of the industries is illustrated in Fig. 2. The most represented industries were healthcare and petro-chemical, followed by aviation and nuclear, with a few articles also published from manufacturing, construction, electricity distribution and railways. The highly represented industries are those that are very complex and technical in terms of their operations and processes, and they generally employ highly skilled personnel. Such organisations have generally been regarded as ‘tightly coupled and interactively complex’ [14].

Fig. 2 RE Publications by Industry 1998–2012

This analysis by industry reveals one significant gap: there is very little published research on RE from traditional industries such as manufacturing, construction and mining, although propositions for the latter two have been suggested [15, 16].

3.2 Definitions

As RE is a relatively new field of study in health and safety, it could be assumed that there is a common understanding of resilience, or of resilience engineering. However, a close examination of the literature surveyed suggests variations in the way different researchers have interpreted resilience, organisational resilience and resilience engineering, with many using the terms interchangeably. One would tend to agree with it being described as ‘confused consensus’ [17]; in spite of the increased number of peer-reviewed publications on organisational resilience and resilience engineering, there is no single way in which these terms have been defined. The explanations appear to vary between the contexts in which they have been investigated. The next section looks at resilience as it relates to organisations.

Organisational Resilience. The literature surveyed suggests there are at least seventeen definitions of organisational resilience (OR). Sheridan captures the concept best as ‘a family of ideas’ [3]. Most of these suggest it is an ability, capability or capacity which is associated with recovering size and structure [18], withstanding major disruptions [19], absorbing disturbances and change [20], maintaining function and structure [21], bouncing back from adversity [10], handling disruptions and variations [22] and recovering to a stable state [23]. These notions of resilience, however, are restrictive and tied to reactive models of organisational health and safety. Moreover, they soften the importance of resilience, considering its influence in achieving ‘nearly accident free performance’ in highly-hazardous organisations [24, 25]. Safety efforts in these groups of ‘high-reliability organisations’ were not simply about bouncing back from adverse events; instead, they extended to guarding against potential minor mishaps and performance variations in normal operations escalating into major breakdowns of organisational processes. This ability enables organisations to anticipate and circumvent threats [26, 27], with the recovery occurring very early in the process [28]. In addition to being reactive, resilience also has a proactive side [29], in that resilient organisations see safety both as a non-event (i.e. success) such as near-misses, and events such as failures, incidents or disasters [30]. This is an important distinction between organisational resilience and many other safety management approaches such as management systems, regulations or safety culture.

The above are all useful definitions, and the outcomes they seek to achieve are desirable, in fact required, if organisations are to survive in the current times of turbulence and uncertainty. However, all organisations should be able to bounce back from adversity. Current organisational theory suggests that organisations are complex adaptive systems [31], which could then mean that all organisations would be deemed resilient, at least to some degree.

However, two things appear to set resilient organisations apart from non-resilient ones.

The first is their ability to continue performing well without being affected significantly by threats and disturbances, and the second is their ability to deal effectively with more than normal, everyday threats and disturbances. This involves going beyond past experiences and preparing for unknown events, threats, and/or hazards; dealing with black swans or ‘unexampled hazards’ [32]. Instead of relying on successful experience of strategies, approaches and interventions previously deployed, these organisations continue to devote effort to anticipating and preparing to deal better with hazards and threats they will face in the future. This ability is born out of a firm conviction that “unexpected trouble is ubiquitous and unpredictable, and thus accurate advance information on how to get it is in short supply” [10].

The discussion above suggests that organisational resilience covers a wide range of concepts, ranging from abstract to concrete facets, with each reflecting different solutions in terms of being reactive, proactive and adaptive, not only to prevent negative outcomes but also to support and strengthen outcomes of processes. There is also a great deal of uncertainty surrounding the notion, and the key ‘challenge facing researchers is to achieve a consensus on the definition’ [33]. However, whether such a consensus is necessary is questionable, because resilience itself is context-specific. Some organisations may be resilient in some aspects in comparison to others, and some sections of an organisation may display a propensity for resilience more than others. For this reason no attempt has been made to provide a unified definition.

Resilience Engineering. Similar to OR, there is no single definition of RE, largely because ‘it exists more as a conceptual framework than a tight knit knot’ [29]. The literature suggests four related but somewhat different definitions.

The first, by Woods and Hollnagel [34], refers to RE as a ‘paradigm for safety management.’ This suggests it is a conceptual framework for safety management similar to human error [35], systems, normal accidents [14] or high reliability [21].

The second, by Chialastri and Pozzi [36], suggests it involves adaptations to variations occurring beyond the design envelope of systems. This involves making temporary adjustments to an organisation’s processes by responding to, monitoring, anticipating, and learning from disturbances, changes, major mishaps, or continuous stressors [37]. From the previous discussions on organisational resilience, this requires (i) responding to regular, irregular and ‘unexampled’ threats; (ii) monitoring what is going on; (iii) anticipating risks and opportunities over the longer term; and (iv) learning from experience. Because an individual is not expected to possess all four of these abilities, they are characteristics of organisations, which are collectively comprised of groups, systems and processes [31].

The third suggests that RE involves ‘developing an organisation’s behavioural and cognitive capability such that it is able to effectively adjust and continue performing optimally near its safe operating envelop, in the presence of everyday threats and environmental stressors at all levels of the organisation’ [16]. Behaviours are well known and well researched in health and safety; they largely comprise actions taken by people at different levels of the system. Following rules and procedures, or violating them, can be observed through behaviours. Cognition involves a mental process of thinking, attending to information, and processing and ordering that information to create meaning, and is closely associated with sense-making [38].

The fourth, by Heikkila et al. [39], refers to it as a new way of thinking about the management of safety, pointing out what is different in RE: ‘Whereas conventional risk management approaches are based on hindsight and emphasize error tabulation and calculation of failure probabilities, resilience engineering looks for ways to enhance the ability of organisations to create processes that are robust yet flexible, to monitor and revise risk model, and to use resources proactively in the face of disruptions or ongoing production and economic pressures.’

This brief analysis suggests that, similar to OR, there is no universally accepted definition of RE. What is apparent is that it represents a sophisticated way of managing safety and risks. The sophistication is not so much in the technology, but more in the way one thinks about safety, and how it can be better managed through existing tools used in more innovative ways. This also involves a shift in perspective, or way of thinking, about health and safety and how it can be managed. These tenets can be used to formulate a unified understanding of RE. However, before doing this, it is useful to understand how RE has been theoretically conceptualised in previous research. This is discussed in the next section, by examining dimensions, factors and measures.

3.3 Dimensions, Factors and Measures

The analysis of RE definitions pointed to a multitude of dimensions, factors and measures which can be useful in informing research and practice. These include culture, cognition, behaviours, levels and the gap between work-as-imagined and work-as-performed.

Culture and RE. The first connection between resilience and culture was proposed by Reason [40], who integrated Mintzberg’s (1989) three drivers of commitment, competence and cognisance with principles, policies, procedures and practices into a Checklist for Assessing Institutional Resilience (CAIR), a questionnaire for assessing organisational resilience quantitatively. The tool was further developed and used to measure resilience in aviation [41] and healthcare [42]. Flin [26] introduced ‘managerial resilience culture’, in which commitment to safety is guided by a firm belief that, when safety and production goals conflict, managers ensure that safety predominates. This is supported by a climate in which (i) workers and managers are more able to speak up when they are concerned about safety, and (ii) workers are assured of not being penalised when they challenge their superiors, stop production or express their concerns about safety risks. Wreathall [43] provided an initial set of ‘themes of highly resilient organisations’, which included (i) top-level management commitment, (ii) just culture, (iii) learning culture, (iv) awareness, (v) preparedness, (vi) flexibility and (vii) opacity, arguing there was a ‘need to tie these approach to the concepts of resilience’. The first three of these can be associated with Flin’s managerial resilience. Skogdalen [44] suggested these dimensions could be mapped into an organisational-human and technical factors model, the OMT method, while Bracco et al. [45] integrated these into a skills-rules-knowledge (SRK) framework for examining resilience in healthcare organisations.

Cognition and RE. Apart from culture, another dimension that has been examined in the literature is cognition. Back et al. [46] linked cognitive resilience with reflective management practice, arguing that identifying the strategies people use to support performance in everyday situations is useful in identifying behaviours that enable people to recognise and adapt to changes, disruptions and surprises created by the system. Back et al. [47] decomposed these at five levels of granularity: (i) individual, (ii) small team, (iii) operational, (iv) plant and (v) industry. Bracco et al. [48], on the other hand, decomposed them at the levels of skill (S), rules (R) and knowledge (K) suggested by Rasmussen (1983). As with culture, most of these papers are largely conceptual in nature, so more empirical investigations are necessary to clarify the relationship between RE and cognition.

Behaviours and RE. The idea that resilience is a behavioural characteristic arises from the work of Vogus and Sutcliffe [49], who linked this with the ability to make adjustments through a hierarchical integration of behavioural systems, whereby earlier structures are incorporated into later structures in increasingly complex forms. Another set of behavioural indicators, decomposed according to vulnerabilities experienced across individual, small team, operational, plant and industry levels, has also been suggested by Back et al. [47]. These papers are also conceptual, so more empirical investigations are necessary to clarify the relationship between RE and behaviours.

Levels of RE. The above section discussed three dimensions of RE that have been used for RE research. Implicit in the discussion of behavioural and cognitive RE is that it is distributed across a number of levels. This is associated with the way it impacts a system at the different levels. One way of describing levels is ‘granularity,’ suggested by Reason [40], who posited that resilience manifests at operational, management and organisational levels. McDonald [50] provided a similar decomposition into three layers: operational, organisational and industrial. Reason’s CAIR has been applied in a limited context in aviation and healthcare, while the decomposition of RE suggested by McDonald has not been the subject of any empirical studies.

The Gap between Work as Imagined and Work as Performed. The preceding sections considered a range of measures and factors that have been used to inform RE research and practice. A further factor is the gap between work as imagined and work as performed [2, 17]. This is the distance between operations as management imagines they go on and how they actually go on in practice, and it has been suggested to be one of the most important factors. There have been some attempts to operationalise this [51, 52], with some level of success. However, very few studies have attempted to integrate the key dimensions, factors and measures and the concept of this gap into a unifying theoretical framework.

4 Summary, Research Gaps and Implications

The following points can be summarised based on this review. First, there is no universally accepted definition of RE; hence there is no uniform way of assessing, examining, or observing it. Second, it is a theoretical construct, not a concrete element, substance, or entity which can be touched, felt or smelt. Third, it is multi-factorial, and common factors used for exploring and/or measuring it include culture, cognition and behaviour. Fourth, it is multi-dimensional, and some of the more common dimensions for investigating it include individual, small teams, operational, plant and industry. Fifth, it is linked in some way with balancing safety and performance. And sixth, the gap between work as imagined and work as performed is important for RE.

In order to advance research and practice in RE, it is important to develop some consensus on what it actually is. The following working definition is proposed: “Resilience engineering is a sophisticated way of managing organisational safety through the development of cognitive, behavioural, and cultural abilities to enable organisational members at all levels to actively anticipate, respond, monitor and learn to operate close to the boundary of safe operations as part of normal work, by narrowing the gap between work as imagined and work as performed.”

Framing the definition in this way makes a number of things clear. One, RE is about organisational safety, not individual safety. Two, it incorporates cognitive, behavioural and cultural aspects of an organisation. Three, although an individual can have all these attributes, it is only when they are collectively distributed across all levels of the organisation that these play a role in RE. Four, the cognitive, cultural and behavioural abilities collectively enable the organisation to anticipate, respond, monitor and learn. Five, it is about operating as close as possible to the boundaries of failure as part of normal work. And six, the gap between work as imagined and work as performed is an important facet of resilience engineering.

The review also identified a number of gaps in research.

First, an analysis by industrial domains suggests current studies have mostly concentrated on complex organisations such as healthcare, petrochemical, aviation and nuclear. Missing is research from traditional high-risk industries such as construction, mining and manufacturing, and addressing this will be important in understanding where RE has a role to play in improving organisational health and safety management in these high-risk industries.

Second, an analysis of factors suggests that culture, cognition and behaviours play an important role, but the specific mechanisms by which this is expected to occur, and what impacts they have (if any), are unclear. Third, analysis by dimensions and/or levels suggests that most research has been limited to examining single units or levels, even though current research in fields such as human error suggests that, at least at the level where work is done, risk and its management are likely to be influenced by other higher levels, such as managers, supervisors, associations and government.

And fourth, while the published papers provide a rich source of information on concepts, ideas and notions, many of these lack a conceptual and theoretical framework that unifies the complex constructs being investigated. In particular, a theoretical framework that will be useful for examining the gap between work-as-imagined and work-as-performed is lacking, and addressing this gap is crucial for advancing research and practice in RE. In this regard, the framework of reflective practice used for investigating cognitive resilience [46] offers a promising start. Future papers will investigate how this can be used to inform research in RE.