1 Introduction

Over the 160 years of the oil and gas (O&G) industry, several operations, such as refining, transportation, production, drilling and storage, have come to characterise its activities. Since the genesis of this industry in 1859, drilling has been the activity that has shown most clearly that the risks involved are balanced by high rewards. However, these rewards can be lost in accidents such as the Deepwater Horizon (2010) or Quinton (2018). Drilling has evolved technologically over the years, increasing its productivity, but a large part of the responsibility for safety still falls on workers. With this in mind, it seems that at this moment in the industry’s evolution, when accidents still happen, the most sensible and effective move is to recognise and analyse the role of the most important link for safety: the human being, the worker. Moreover, understanding the evolution of drilling, and its increasing complexity, can enable researchers and industry personnel to develop an adequate perception of the interactions between the equipment, environment, organisations and workers, especially with regard to risks and safety issues, and therefore to prevent losses. In certain respects, the O&G industry is dealing successfully with this scenario. Official numbers show that more than 3000 offshore wells are drilled worldwide every year without any major incidents, and this safety performance record is confirmed by the blowout data reported in the industry (Strand and Lundteigen 2016). However, accidents have shown that the current safety barriers and risk recognition techniques may be faulty, and that a different approach to understanding the complexity of, and interactions in, drilling platforms (also called oil rigs) should be taken. Such an approach should be grounded in a methodology that offers an alternative to the traditional, linear ways of thinking about and analysing situations, risks and interactions.
The Functional Resonance Analysis Method (FRAM) provides a way to describe a sociotechnical system in terms of its functions and the interactions between these, to analyse where performance variability may arise and spread throughout a system, and how the system may adapt to keep performance within the required parameters (De Vries 2017). In this paper, it is applied to understand the risks and recognise the relevant human factors and non-technical skills involved in the drilling unit activities performed by drillers.

2 The evolution of human factors in the industry

The concepts of human reliability and the definition and early measurements of human error started with the empirical theories of Heinrich, and were further developed by other authors. They influenced risk assessment theories and discussions of industrial accidents throughout much of the twentieth century, especially the accidents of Three Mile Island (TMI) (1979), Bhopal (1984) and Piper Alpha (1988) (Turner and Pidgeon 1997). Seeking to integrate and comprehend technology and behaviour, human factors engineering (HFE) developed as a discipline that focused on the interactions between humans and technology, as well as on systems and processes, especially in the nuclear industry. Its aim is to discover and apply knowledge about human capabilities and limitations to system and equipment design, to ensure that the system designs, human tasks and work environments are compatible with the sensory, perceptual, cognitive and physical attributes of the personnel who operate systems and equipment (Hollnagel 2014). Bringing a balance and more consolidated perspective, Luquetti dos Santos et al (2013) have contended that human factors deal with issues related to humans, their behaviour and the physical aspects of the environments in which they work. In this context, ergonomics is an inter-disciplinary research field that focuses on improving the functioning of human–technology interactions involving safety, especially those that show the difference between work-as-imagined (WAI) and work-as-done (WAD).

The current understanding of human factors incorporates all those that can have influence on a person’s performance during their work, whether these originate from inside, outside or are part of the individual characteristics of that person. For IOGP (2018), human factors are simply those things that can influence what people do. They may include factors relating to the job people do (e.g. time available or control panel design), personnel factors (e.g. fatigue, capability), and organisational factors (roles, staffing levels). The idea is that during the events leading up to accidents, people are acting in a way that makes sense to them at the time. All their knowledge, training, experience, organisational culture, and input from the environment combine to influence the decisions made and the actions taken. In this way, human factors are not simply "what the human being does," or "the mistakes made by the worker"; they are much more than that and require a much greater understanding than simply blaming the human being for doing something wrong. In a labour context, human factors are the set of factors that influence workers in their labour activities, which can be individual, organisational, technological, environmental, among others, as represented by Fig. 1.

Fig. 1 Human factors scheme. Source: Authors, 2020.

2.1 Non-technical skills

Individual skills and organisational characteristics that enhance safety in complex socio-technical systems are also considered in human factors engineering. These are known as non-technical skills, and are the cognitive and social skills required for productive and safe operations. The most relevant are situational awareness, decision-making, communication, teamwork and leadership (Flin et al. 2016). These skills are based on individual knowledge about the work, which can be classified as explicit or tacit. Explicit knowledge is that which can be easily systematised and communicated through standards, procedures and company rules (Crandall et al. 2006). Tacit knowledge, on the other hand, is difficult to recognise and formalise, despite being present in all workers’ activities and resulting from their interactions with all the elements present in their work environment. In particular, situational awareness is the set of individual perceptions of the relevant elements in the workplace, as determined by the interaction between system responses and human senses, and contributes to a person’s understanding of a particular scenario (Endsley 1995). Because those senses and perceptions are unique to each person, the understanding is individual and may have some limitations.

According to ARPANSA (2017), non-technical skills do not include the technical abilities required to get the job done, such as the know-how necessary to operate a machine or conduct a certain operation, which is provided through proper training and work experience. However, non-technical skills complement technical skills and knowledge, making them more efficient and effective. In this context, the limitations of human attention represent a key element of human information processing, and can be described in terms of three categories: focused, selective and divided attention (Wickens et al. 2016). These limitations also affect workers’ performance in offshore workplaces, where situational awareness plays an important role, helping workers to understand the scenario and divide their attention between tasks adequately, enabling the most suitable decision-making. It is important to mention that cognitive tunnelling, a form of focused attention, can occur in emergency situations, blurring situational awareness and, consequently, the decision-making process.

The complexity of workplaces has grown with the evolution of technology, creating an intense cognitive load and intense interactions between workers and systems. In these interactions, human beings are characterised by their capacities to learn, to adapt and to plan, which are essential to comprehending complex and cognitive systems, given a certain context (Hollnagel and Woods 2005). Situational awareness can be related to the endless human capacities for learning and adapting. Individuals trust their cognitive ability to provide the necessary responses to their interactions. This response to the environment is a human ability, according to Nemeth (2004), which relies on the cognitive capabilities of perception, interpretation and response through the senses. The individual response is unique, but strongly influenced by other factors, such as organisational culture, environmental conditions and technological complexity: in other words, human factors. These factors can affect learning, performance and decision-making (Nemeth 2004). In a complex sociotechnical system, such as an offshore oil rig, the interpretation of a situation and the consequent implementation of an action may be incomplete or incorrect, leading to unexpected results. This problem is often solved by relying on individual non-technical skills that, acting together, can build a safe work environment, in which risks are present, but managed. Human decision-making is powerful, and fragile, because of the ability to develop subjective criteria to answer system demands (Hollnagel and Woods 2005).

In this way, situational awareness can be understood as the worker’s ability to understand how the whole system in which he is inserted works, providing responses and interacting while continuously learning. It has become an increasingly prominent skill that is contributing to safety and operational performance in many technological areas (Salas and Dietz 2011), but most notably in the energy sector, in which the O&G industry stands out. Despite the prominence of this ability, especially in aeronautical safety, there are also questions about its structuration. In fact, the technological advances designed to improve safety and productivity also elevate system complexity (Salas and Dietz 2011), relying on human abilities to manage the complexity and human factors issues. In this context, and seeking to understand complex sociotechnical systems, FRAM arises as a suitable methodology that can, at the same time, recognise and analyse human factors and their related non-technical skills. Especially in the O&G industry, where high complexity, high risks and high skills are present in refineries, terminals and platform systems (França and Hollnagel 2019), the FRAM seems reasonable and adequate for comprehending operations and promoting safety.

3 The FRAM methodology

The Functional Resonance Analysis Method (FRAM) is a methodology for analysing and describing the nature of workday activities. Because of its structure, it can be used to analyse past events in a complex system, such as an accident investigation, as well as possible future events, such as the human factors recognition and analysis in a drilling unit of an offshore drilling rig. To a professional who has never seen the graphical representation of a FRAM model, this methodology may seem relatively complex, which it is not. In fact, the analysis promoted by this methodology is not an algorithmic process, but rather the gradual development of a mutual understanding among professionals working as a team. It is a kind of complex discussion about the complex relationships of complex socio-technical systems, but one done in a simple way (Hollnagel 2012). In this work, the FRAM is the methodology applied to understand how the real work is done by drillers inside a doghouse, showing how WAD occurs and how their performance can be affected by the human factors present in a complex socio-technical system, namely an offshore oil rig.

Sociotechnical systems’ behaviour (interactions between social and technical elements with organisational and environmental issues) is heavily dependent on interactions within and between system components (Wooldridge et al. 2019). This is true regardless of the occupation area, and is a dynamic that can be said to characterise operating rooms (ORs) and paediatric intensive care units (PICUs), as well as refineries or offshore oil platforms. Each has different elements and different characteristics, but they are all complex sociotechnical systems. The FRAM was found to be a valuable methodology for describing such systems and their human factors interactions, based on a strong grounding in empirical studies and themes of “making work visible,” symmetry between human and nonhuman, and work as activity. Indeed, FRAM supports describing the dynamic interactions in sociotechnical systems from the perspective of normal performance variability, which is necessary to understand how the real work is performed (Zheng, Tian and Zhao 2016).

To build a FRAM model, it is necessary to follow four steps. The first step is the identification and the description of the functions, which can be human, technological or organisational. The model seeks to describe in detail how a task is done as a real, everyday activity, rather than to describe it as an overall procedure. Once the function description is done, the second step is the recognition of the output variability of each function of the model, which involves characterising each function by its potential and actual performance variability (Hollnagel 2012). After the recognition of the output variability, a third step is needed, which is the examination of the instantiations of the model to understand how the potential variability of each function can become resonant, leading to unexpected results, as stated by the premises of the method. The fourth and last step is the monitoring and managing of the performance variability of each proposed instantiation, as identified by the functional resonance that characterises the performance variability of the method and can result in positive and negative outcomes. Hollnagel (2012) proposes that the most fruitful strategy consists of amplifying the positive effects (i.e. facilitating their occurrence without losing control of the activities), and damping the negative effects (i.e. eliminating and preventing their occurrence). Traditionally, it is used to create barriers to and defences against harmful situations. In this case, it is useful in the recognition and analysis of the relevant human factors.
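The first three steps can be illustrated with a small data-structure sketch. This is a hedged illustration only, not the authors' model nor the FMV software: the class, field and function names are hypothetical; the six aspects and the time and precision categories follow the general FRAM description in Hollnagel (2012) rather than anything specified in this section; and the three functions borrow names from the doghouse model presented later in the paper, while the aspect labels and couplings between them are assumed purely for demonstration.

```python
from dataclasses import dataclass, field

# The six aspects of a FRAM function (general FRAM convention; an assumption here).
ASPECTS = ("input", "output", "precondition", "resource", "control", "time")


@dataclass
class Function:
    """Step 1: a function described by its name, kind and aspect labels."""
    name: str
    kind: str  # "human", "technological" or "organisational"
    aspects: dict = field(default_factory=lambda: {a: set() for a in ASPECTS})
    # Step 2: output variability, expressed in terms of time and precision.
    time_variability: str = "on time"        # e.g. "too early", "too late"
    precision_variability: str = "precise"   # e.g. "acceptable", "imprecise"


def couplings(functions):
    """Step 3 helper: an output label of one function that matches a
    non-output aspect label of another function forms a coupling."""
    found = []
    for upstream in functions:
        for downstream in functions:
            if upstream is downstream:
                continue
            for aspect in ASPECTS:
                if aspect == "output":
                    continue
                shared = upstream.aspects["output"] & downstream.aspects[aspect]
                for label in sorted(shared):
                    found.append((upstream.name, label, downstream.name, aspect))
    return found


def variable_couplings(functions):
    """Couplings whose upstream output deviates in time or precision, i.e.
    the places where variability could propagate and possibly resonate."""
    by_name = {f.name: f for f in functions}
    return [
        c for c in couplings(functions)
        if by_name[c[0]].time_variability != "on time"
        or by_name[c[0]].precision_variability != "precise"
    ]


# Hypothetical instantiation: labels and couplings are illustrative assumptions.
monitor = Function("Monitor pressure instruments and screens", "human")
monitor.aspects["output"].add("pressure readings known")
monitor.time_variability = "too late"

control = Function("Control the drilling fluid (mud) pressure", "human")
control.aspects["input"].add("pressure readings known")
control.aspects["output"].add("mud pressure adjusted")

drill = Function("Drill a section of an offshore oil well", "human")
drill.aspects["precondition"].add("mud pressure adjusted")

model = [monitor, control, drill]
```

In this sketch, `couplings(model)` lists every output-to-aspect link, while `variable_couplings(model)` keeps only those fed by a variable output. The fourth step, monitoring and managing variability, remains a matter of team discussion and judgement and is deliberately not encoded.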

4 Materials and methods

The application of observational methods is beneficial for understanding the demands and strategies that people have developed to deal with different contexts, in the same and in different environments (Crandall, Klein and Hoffman 2006). Interviews complement those methods by validating what was observed, providing privileged information on interactions, interpretations and attitudes. Elements of a company’s organisational culture, for example, can be identified through observations and corroborated in interviews. In this context, despite costs and time constraints, onboard observation and face-to-face interviews (onboard and after shift) provided representative and valid information (Creswell and Creswell 2017). Face-to-face contact is a human connection ruled by empathy, and most of the time it enabled the researcher to acquire information that would not be written down.

The open-ended approach chosen for the interviews conducted as part of this research, structured by one open question and notes, aims to find out how workers perceive things in their labour interactions and gives access to their first-hand experiences (Silverman 2017). It is important to know how the real work is done daily because this reveals the exact difference between WAI and WAD. The latter is a source of valuable information for understanding the real interactions in complex sociotechnical systems. According to Brinkmann and Kvale (2009), in narrative interviews, the interviewer asks for stories directly, structuring the different happenings recounted into coherent reports. The listening and recording cannot interrupt the answering process, but interactions can occasionally occur if the interviewer requires clarification or assistance. In this research, the interviews were performed by asking one question, taking notes and exchanging knowledge through the process. The observations on board were also captured in notes. Those, together with the interviews, were the base for the FRAM modelling of the work inside a doghouse.

4.1 Description of the field study: drilling unit (doghouse) operations

Being 400 km away from the shore, working day and night in an environment made of iron and bolts, under sun, rain and waves, with drilling mud, noise, vibration, grease, oil and gas all over equipment, floors and bodies: this is the labour scenario of the drillers. The main personnel responsible for the drilling operations are the driller, the tool-pusher and the drilling supervisor, who are at the sharp end and work together with other crew members and the ship crew, mainly the ones responsible for the oil rig positioning and the captain (Strand and Lundteigen 2016). The interactions between workers, systems and equipment are very intense in the drilling unit, the focus of this paper, and on the drilling floor, where the drill itself is in action. According to Ramzali et al. (2015), drilling is a key part of the oil and gas system. Success in drilling activities will depend on the ability to substantially improve the operational reliability and availability of this process. The upstream sector of the oil and gas industry, which is called ‘exploration and production’, includes all oil and gas drilling activities and accounts for a higher critical injury incident rate than any other domain in the petroleum industry.

The drilling unit, also called the doghouse, is the workstation where the driller performs a series of activities to effectively drill the well hole, a process which is divided into four phases: the conductor phase, the surface phase, intermediate phases (as many as needed) and the production phase. Inside the doghouse, the driller is responsible for monitoring, controlling, drilling, exchanging, replacing, observing, communicating and safely stopping operations during emergency situations. This series of activities is crucial for the construction of the offshore oil well and falls under the purview of one person inside the doghouse, who performs it by dividing his attention and managing the necessary priorities, developing skills that are not exactly measurable, and are often called non-technical skills (Sneddon et al. 2006). An example of an oil rig is presented in Fig. 2. The doghouse is number 32 in this scheme. Inside a doghouse, a single worker (a driller) manages the entire drilling operation, sharing his attention; interacting with systems, controls and other workers; and responding to unplanned situations and occurrences.

Fig. 2 Project diagram of an offshore oil drilling rig. Source: Armstrong, 1941.

The driller must maintain control of all operations inside and outside of the doghouse. Outside, on the drill floor, he needs to keep eye contact and clear communication during his entire work shift. There are also other crews responsible for the rig’s dynamic positioning system (DPS) and drilling fluid (pump room) who have to keep clear channels with the driller. If an oil rig could be compared to a human body, undoubtedly the driller would be the heart, the main organ of all systems. Ramzali et al. (2015), Hinton et al. (2018) and França et al. (2019) have noted that, of all drilling operations roles, the driller’s is the most critical, because he is responsible for controlling, monitoring, maintaining and communicating all drilling steps for his entire work shift. Non-technical skills like situational awareness, leadership and communication are essential to perform his job, giving him the ability to manage different actions in different scenarios. Furthermore, it is important to note that all unexpected issues, like an emergency stop or drilling stop, are also his responsibility, demanding singular non-technical skills from him, which arise from his own experience and individual capabilities. Different situations and emergency scenarios require a variety of responses that automated complex systems are sometimes unable to produce. In this sense, the variability of the human being, of the worker, is a natural response to the demands for variability that complex socio-technical systems require daily. In Fig. 3, it is possible to see an example of driller performance inside the doghouse on an oil rig operating in the offshore area of the Campos Basin, Brazil.

Fig. 3 Driller inside of the doghouse of an oil rig, Brazil offshore area. Source: Authors, 2020.

4.2 Interviews, onboard observations and data gathering

Having as basic premise the understanding of how the work of the driller happens, the researcher set out to gather as much information as possible, with as much veracity as possible. Interviews, data collection and onboard observations, all in situ, were carried out over a 6-month period, following these planned methodological stages and principles:

  1. One onboard observation per month, without interviews. The goal was only to observe what actually happened inside the doghouse.

  2. The interviews were not structured by long questionnaires or time limits. The only question the drillers were asked was 'How do you perform your job?'

  3. The data collected from interviews and observation were the base for the FRAM modelling of the doghouse operations, with the researcher taking the activities of the driller as the guideline for this analysis.

  4. Once the FRAM model was ready, it was validated with the drillers who contributed, with attention being given to verifying the time and precision variability of the main and most critical function outputs.

  5. Once the FRAM model was fully validated by the drillers who contributed, it was also validated with FRAM specialists in Brazil and Europe, to guarantee an adequate use of the method.

  6. If the FRAM specialists proposed significant changes, the researcher returned to the drillers and repeated the entire validation cycle with drillers and FRAM specialists until complete validation was received from both.

  7. Once the FRAM model was fully validated, the modelling phase ended, and the work analysis began.

Despite all the preparation and planning, the cycle of interviews, observations and data collection did not develop exactly as planned, so it was necessary to redesign and make some adaptations. For example, due to onboard restrictions imposed by the companies, the 6-month period was, in fact, 11 months, in which

  • Four onboard observations were done, resulting in dozens of pages of notes, drawings and schemes. A few photos, authorised by company leaders, were also taken.

  • The one-question interviews ('How do you perform your job?') were performed onboard and on land with drillers who had just disembarked from their 15-day work shifts. Sixteen drillers were interviewed, eleven on land and five onboard.

The other steps of the planned methodology occurred as designed. The FRAM model was validated by five drillers who participated in the interviews, as well as by FRAM experts in Brazil and Europe. The FRAM model of operating the drilling unit (doghouse) was built with 26 functions, each with variabilities that showed how critical the drillers' work was.

4.3 FRAM model of the drilling unit (Doghouse) operations

The interviews, onboard observations and data-gathering provided information to build a FRAM model capable of representing the real work done by drillers inside the doghouses on offshore oil rigs. Several activities performed by the drillers were, indeed, the FRAM functions of the model, which are part of this methodology and reflect the interaction with all the complex sociotechnical systems. They presented variabilities in terms of time and precision, which resulted in other instantiations for different variabilities. Some of these functions were characterised as foreground functions or background functions according to their relevance. These activities that described the functions of the FRAM model of the drillers’ activities inside a doghouse were

  • Operating the drilling unit (doghouse);

  • Control the drilling depth;

  • Control the drilling speed;

  • Control the drilling fluid (mud) pressure;

  • Drill a section of an offshore oil well;

  • Use the joystick to control drill depth and speed;

  • Monitor pressure instruments and screens;

  • Monitor the column weight of drilling;

  • Control parameters to proceed with the drilling;

  • Maintain real-time talk with the drilling floor crew;

  • Have pressure from supervision;

  • Have trained and certificated drillers;

  • Stop the drilling to insert new pipes (joints);

  • Manage a new insertion of pipes (joints);

  • Keep awareness of the drilling floor activities;

  • Have intercommunication equipment ready;

  • Control the drilling fluid (mud) pumps;

  • Monitor the level of trip tank;

  • Monitor the torque of the drill;

  • Have a drilling program;

  • Have a new shift of drilling operators;

  • Manage drilling malfunctions due to wear and tear;

  • Execute an emergency stop;

  • Recognize and manage relevant external noises;

  • Recognize and manage relevant equipment vibration;

  • Recognize and manage relevant smell of hydrocarbons.

Of these 26 functions, the model revealed 7 were background functions, and 19 were foreground functions. However, despite these classifications, some of the background functions were shown to be crucial functions. The background functions also highlighted the importance of the non-technical skills for the driller, especially in critical and emergency situations. As can be seen in Fig. 4, some functions are highlighted in colours, a feature of the FMV® software (Hill 2018), used in this model to show the importance of some functions.

Fig. 4 FRAM model of the drilling unit (doghouse) operations. Source: Authors, 2020.

The main function of this model is marked in green, to emphasise the importance of the human factors analysis from this starting point, the doghouse operations. The background functions that played some relevant role are marked in grey. The foreground functions that had only one variability (Time or Precision) are marked in light blue, while those that had both are marked in red due to their importance. Finally, the functions that represent the non-technical skills developed by the driller are marked in yellow. Although all functions are equally important for the construction of this FRAM model, some of these have a significant role within this complex socio-technical system. Those functions whose role is differentiated in the context of human factors and non-technical skills will be analysed in more detail.
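The colour scheme described above can be read as a simple decision rule. The sketch below is one possible encoding under stated assumptions: the function name and its flags are hypothetical, it does not use any FMV feature, and the order in which the rules are checked (main function first, then non-technical skills, then background, then foreground variability) is an assumption, since the text does not say which colour wins when a function fits more than one description.

```python
def function_colour(*, is_main=False, is_nontechnical_skill=False,
                    is_background=False, is_relevant=False,
                    varies_in_time=False, varies_in_precision=False):
    """Return the highlight colour for a function, or None if it is not highlighted.

    The precedence of the checks is an assumption made for this sketch.
    """
    if is_main:
        return "green"        # main function: operating the drilling unit
    if is_nontechnical_skill:
        return "yellow"       # functions representing non-technical skills
    if is_background:
        # Only background functions that played some relevant role are marked.
        return "grey" if is_relevant else None
    if varies_in_time and varies_in_precision:
        return "red"          # foreground function with both variabilities
    if varies_in_time or varies_in_precision:
        return "light blue"   # foreground function with a single variability
    return None


# Hypothetical usage; the flag values are illustrative, not taken from the model.
example_colour = function_colour(varies_in_time=True, varies_in_precision=True)
```

Writing the rule out this way makes the undefined cases explicit, for example a foreground function with no recorded variability, which this sketch leaves unhighlighted.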

4.4 Analysis and discussions of the relevant functions for human factors and non-technical skills

4.4.1 The ‘Operating the drilling unit (doghouse)’ function

Based on analysis of the model, the most relevant function, and the starting point for a human factors analysis, is ‘Operating the drilling unit (doghouse)’, a human function which has twelve different outputs, establishing couplings with all other functions of the model. Indeed, as the driller is responsible for all drilling actions inside of the doghouse, it seems obvious that all other functions of the model are influenced to some degree by that one. When it is analysed by resonance, this function is the centre of a web, producing a tiny vibration that affects the model. Table 1 shows all the outputs from this function, and Fig. 4 shows the function itself highlighted in green.

Table 1 Output of the function “Operating the drilling unit (doghouse)”

From this analysis, some questions can arise, such as, ‘If too much depends on just one man, a driller, which factors can influence his performance, positively or negatively?’ The answer is human factors. So, a human factors analysis is not only intended to see what is wrong and fix it; it must also identify what is going right, what is increasing the performance, and how to look for and emphasise those things in a scenario, as well as how to replicate them in others. At several moments, drillers deal with a flood of screen alarms while managing pressure from supervisors and giving appropriate (and productive) responses to system demands. In this sense, the FRAM is an adequate methodology, because it can identify the natural variabilities of human performance, and, from the identification of this variability, assist in the recognition of the human factors that are influencing workers. Bearing this in mind, and still analysing this function, we can see that non-technical skills, such as situational awareness and communication, have a primary role to play in the productive and safe execution of activities. Such skills are recognised and are already part of studies, both in the area of O&G, and in civil and military aviation, as noted in Dekker (2015) and Flin, O’Connor and Crichton (2016). However, in this research, other non-technical skills that were previously not adequately known, specifically the recognition and management of relevant external noises, hydrocarbon smells and equipment vibrations, appear essential for the execution of the work.

4.4.2 The ‘Recognise and manage relevant external noises’, ‘Recognise and manage relevant hydrocarbon smells’ and ‘Recognise and manage relevant equipment vibration’ functions

The functions highlighted in yellow in the FRAM model play an important role in drillers’ activities. It is well known that some non-technical skills, like communication, situational awareness and leadership, are important to the safe performance of many high-risk activities, such as piloting, mountaineering and offshore operations. This research also revealed some additional non-technical skills that help the drillers to perform their roles in a productive and safe way, even in emergency situations. These are the recognition and management of relevant external noises, hydrocarbon smells and equipment vibrations, and they are represented in Fig. 5.

Fig. 5 Functions that represent some drillers’ non-technical skills. Source: Authors, 2020.

Despite not being present in any procedures, good practices or standards, the recognition and management of relevant external noises, hydrocarbon smells and equipment vibrations are skills observed onboard, and reported by drillers as essential to performing their jobs. In one interview, a driller declared, ‘Dude… I was in the night, relaxed and contemplating the drill floor… then I felt that oil smell, which activated me and at the time I had to control the weight of the fluid… it was a matter of seconds… if I hadn't done it, if I hadn't gotten smart with the smell… it was going to suck… it was going to blowout’. Not only this driller, but others also narrated situations in which if it were not for their perception of the vibrations of the doghouse, or the smell of the oil from equipment leakages, or the recognition of some strange noises, something really could have gone wrong. As much as these perceptions from the drillers can be labelled as ‘situational awareness’, some researchers have been recognising that different professions demand different skills—technical or non-technical (Saldanha et al. 2020). The drillers have developed specific non-technical skills to deal with the specific demands of their activities inside of the doghouse, using the natural variability of their human perceptions positively. The situational awareness construct considers humans’ perceptions, memory structures, mental models and attention, especially in decision-making situations in the cognitive psychology and human factors fields, and particularly considering the role of goals and goal-directed processing in directing attention and interpreting the significance of perceived information (Endsley 2015).

4.4.3 The ‘Keep awareness of the drilling floor activities’ function

Maintaining constant awareness is one of the most important activities performed by the driller, although he is sitting in the doghouse almost all the time. It is through the doghouse's observation windows, which are extremely wide and well positioned, as shown in Fig. 3, that the driller maintains continuous awareness of the drill floor. Situational awareness is especially important when drilling is stopped to insert new pipes (joints), when drill bits are replaced or when there are emergency stops, not only for performing the job properly but also for performing it safely. Figure 6 shows this function and its couplings.

Fig. 6 Function “Keep awareness of the drilling floor activities”. Source: Authors, 2020

According to Dekker (2015), situational awareness is regarded as a causal construct that exists in the mind of a human operator and that is very relevant to safety. But this relevance is not related to blaming or explaining accidents, such as mishaps in aviation and other settings. It is much more than that. Situational awareness gives a real perception of what is happening, at the exact moment when it is happening, allowing the human brain, through its past experiences and survival genes acquired over years of evolution, to give an accurate response to the prominent risk. During the evolution of humanity, humans successfully colonised all the continents, and their adaptations to local environments, as well as risks, resulted in the development of genes that in the distant past saved the species from natural hazards (Roberts 2018), and today enable workers to have risk perception and situational awareness. The individual risk perception of the drillers is a crucial characteristic that is intrinsically connected to their behaviour and, consequently, their actions in daily routine and emergency situations (Salas and Dietz 2011). Risk perception is, therefore, a key element for safety.

Thus, the fact that the output variability of this function is Time: Too early and Precision: Acceptable does not signal that something is wrong. Quite the opposite: as mentioned by the drillers in the interviews, they must always predict risk situations on the drill floor, anticipating their own movements and actions, to prevent something wrong from happening inside or outside of the doghouse. In addition, they also reported that they continuously adapt to the situations that happen, giving the appropriate response, which sometimes means not following procedures precisely, but rather doing what is safer and more productive. Thus, this variability of human responses, which is reflected in the most diverse non-technical skills, in particular situational awareness, is not exactly a problem. In fact, it is a solution, because it brings variability to different risk situations, giving an adequate response to each possibility of a dangerous scenario, mainly in complex socio-technical systems.

4.4.4 The ‘Have a new shift of drilling operators’ function

The ‘Have a new shift of drilling operators’ function presented output variabilities both in terms of time and precision. Based on interview responses and the on-board observations, these variabilities are definitely present, but do not have a substantial negative impact on daily operations, particularly because the drillers manage their own variabilities successfully. In terms of time, the output is too late mainly because there are almost always some delays when changing work shifts, due to individual issues of the drillers or to organisational circumstances, like mandatory double shifts or ship abandonment simulations demanded by the regulatory agency. Environmental issues like storms or intense wind can also cause delays in the shift routine of the drillers. Figure 7 presents this function and its couplings.

Fig. 7 Function “Have a new shift of drilling operators”. Source: Authors, 2020

Regarding precision, the function output is acceptable because, due to time restrictions and multiple tasks, it is necessary to talk about only the essential and critical information during shift changes, leaving aside day-to-day information related to the operation. Such information is not crucial to the operation or safety but is part of the operating procedures. Thus, a dilemma becomes clear for the driller: to comply strictly with the rules and procedures, or to be efficient and do the work productively and quickly. This dilemma is contemplated by the ETTO Principle (Hollnagel 2009), under which the worker, on a daily basis, has to balance between being extremely productive (efficiency) and extremely safe (thoroughness). In the onboard observations, it was verified that the drillers naturally make this transition, acting conservatively when the communication is flawed, or acting more productively when communication is full and effective.

4.4.5 The ‘Manage drilling malfunctions due wear and tear’ function

The three oil rigs involved in this research were assembled in the 1980s and 1990s, and are therefore vessels that can be considered outdated. Given the regular ageing processes of their equipment, there are many parameters that can affect safety, such as the influences of process materials, design, corrosion, operating conditions, maintenance and intense weather (Pasman et al. 2017). Even during normal operations, drillers have to deal with equipment failures caused by natural wear, which are accentuated by those parameters. Figure 8 shows this function and its couplings.

Fig. 8 Function “Manage drilling malfunctions due wear and tear”. Source: Authors, 2020

The output variabilities of this function are Time: Too late and Precision: Imprecise. The output is too late because, as reported by the drillers, there is a considerable time between a maintenance request and the actual execution of the maintenance. Sometimes, it is only after a catastrophic equipment failure (e.g. rupture of a drill line due to wear and tear) that maintenance is promptly carried out. In terms of precision, the output is imprecise because the result of maintenance is a temporary patch, which remains in place until there is a catastrophic failure. In very rare cases, maintenance is carried out properly, and this is usually associated with some inspection by the Regulatory Agency or Ministry of Labour. During one of the onboard observations, a malfunction in the goose-neck, a flexible and sturdy hose that interconnects mud pumps to the top drive of the drilling derrick, caused a large leakage of drilling fluid over the entire drill floor and part of the doghouse, requiring a controlled stop of all drilling operations. Additionally, severe weather conditions increase wear and tear failures, requiring more attention from drillers, based on their risk perception and performance.

Furthermore, in most of the interviews, the drillers reported malfunctions on a variety of equipment, such as drill bits breaking in the first moments of drilling, drill pipes (joints) smashing, corrosion and falling parts on the drilling derrick and collapsing rotary tables, among others. Offshore accidents in the O&G industry have shown that maintenance failures that grow into substantial ones are key elements feeding the chain of events behind major catastrophes at sea. The explosion and sinking of the oil rig Deepwater Horizon (2010), in the Gulf of Mexico, resulted in the tragic deaths of eleven workers and the catastrophic leak of 4.9 million barrels of oil over 87 consecutive days. Desk phone scattering, mud pump malfunctions and hydrocarbons sensor failures were maintenance faults that made a significant contribution to this disaster (Lustgarten 2012). For the offshore O&G industry, maintenance is not just another necessary item for daily operations; it is a matter of safety, and of avoiding accidents, fatalities and catastrophes.

4.4.6 The ‘Stop the drilling to insert new pipes (joints)’ function

All the drillers interviewed were unanimous in stating that, of all the activities they performed, stopping the drilling to insert new pipes (joints) is the one that involves the most attention, interaction and expertise, because a failure in this operation releases forces and pressures that are capable of destroying the entire drill floor and doghouse at once. The output variability of this function is Time: Too late and Precision: Acceptable, as the drillers reported that they have to adjust operations to safely stop the drilling, which occurs in acceptable conditions most of the time. Small adaptations, like controlling the drilling fluid or turning off pumps, or critical ones, like forcing the rotary table to assist the pipe decoupling, are examples of the imperfect conditions under which the work is commonly done. Regarding time, due to the multitasking performance demanded by the doghouse activities, as seen in Fig. 3, it is quite common that the insertion of new pipes does not occur at the time planned in the drilling program, but around that time. Figure 9 provides a view of this function and its couplings.

Fig. 9 Function “Stop the drilling to insert new pipes (joints)”. Source: Authors, 2020

Despite the attention that the criticality of this function demands, events and accidents involving the insertion of new pipes are common, causing loss of equipment, major leaks and fatalities. Recently, an oil rig accidentally cut off its own drill pipe while operating off the South Island coast of New Zealand, in the Tasman Sea. The incident occurred in January 2020, when a blowout preventer (a safety device used to seal drill pipes and prevent hydrocarbon release) was mistakenly activated, just at the time the driller was stopping the drilling to insert new pipes (Morris 2020). From the onboard observations, it was possible to notice that it is indeed a tense operation, with the workers inside and outside of the doghouse very focused on their tasks, and only engaging in essential communication.

The output of this function, drilling stopped for a new insertion of pipes (joints), is coupled into the input of the ‘Manage a new insertion of pipes (joints)’ function, and the variability of this output can severely alter the management of a new insertion because the resonance through ‘Manage a new insertion of pipes (joints)’ directly affects, at the same time, the drilling of a new section of the well and the execution of an emergency stop. While the drilling of a new section affects production more than the occurrence of an accident, the execution of an emergency stop is an accident itself, causing the driller to experience situations that may be out of control. Therefore, the resonance from the ‘Manage a new insertion of pipes (joints)’ function is intrinsically connected to the production and safety of the entire oil rig, since the objective of the work is precisely to continue drilling safely. The stop for an insertion of new pipes is, in fact, critical and requires a degree of attention that translates into the variability of the output of the function that deals with this action.
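The coupling described above can be illustrated with a minimal sketch, offered only as a reading aid and not as part of the FRAM model built in this research. It assumes a simplified representation in which each function carries an output variability (time and precision) and a list of downstream functions fed by its output. The class and helper names (`Function`, `couple_output_to`, `reachable`) and the names of the two last downstream functions are hypothetical, chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Time(Enum):
    TOO_EARLY = "Too early"
    ON_TIME = "On time"
    TOO_LATE = "Too late"


class Precision(Enum):
    PRECISE = "Precise"
    ACCEPTABLE = "Acceptable"
    IMPRECISE = "Imprecise"


@dataclass
class Function:
    """Simplified FRAM function: a name, its output variability, its couplings."""
    name: str
    time: Optional[Time] = None            # None: not reported in the text
    precision: Optional[Precision] = None  # None: not reported in the text
    downstream: List["Function"] = field(default_factory=list)

    def couple_output_to(self, other: "Function") -> None:
        """Record that this function's output feeds the other function."""
        self.downstream.append(other)


def reachable(start: Function) -> List[str]:
    """Names of functions the output variability of `start` can reach,
    in breadth-first order along the output couplings."""
    seen: List[str] = []
    queue = deque(start.downstream)
    while queue:
        fn = queue.popleft()
        if fn is start or fn.name in seen:
            continue
        seen.append(fn.name)
        queue.extend(fn.downstream)
    return seen


# Variability values as reported for this function in the text.
stop = Function(
    "Stop the drilling to insert new pipes (joints)",
    Time.TOO_LATE,
    Precision.ACCEPTABLE,
)
manage = Function("Manage a new insertion of pipes (joints)")
# Hypothetical names for the two activities affected through 'manage'.
drill_new = Function("Drill a new section of the well")
emergency = Function("Execute an emergency stop")

stop.couple_output_to(manage)
manage.couple_output_to(drill_new)
manage.couple_output_to(emergency)

affected = reachable(stop)
```

Under these assumptions, `affected` lists the management of the new insertion first, followed by both the production-related and the safety-related activities, which mirrors the point that a single output variability resonates into production and safety at the same time.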

Paradoxically, the variability that can feed a chain of accidental events is the same variability that is able to provide complex responses to the complex demands of an accident in this system, leading to a resilient state where it continues to function or stops with minimal losses. In this context, it is possible to see that the output variability of the function ‘Stop the drilling to insert new pipes (joints)’ is, at the same time, the reason why the main part of the system works and a critical contributor to failures that lead to accidents, making this activity an object of attention for the safety of the oil rig. In this workplace, the origin of accidents and of productive work can be the same, and depending on how the variability happens, it is the result that will determine whether it was an undesirable event (the accident) or a desirable one (increasing performance or maintaining production).

Two of the sixteen drillers interviewed reported situations in this operation that depended on their non-technical skills to promote safety. The first one declared that, during the regular stopping of the drill pipes, he heard a strange noise, something like the scratching of a fork on a metal surface. This noise, which was unlike what he usually heard, caught his attention, and he realised that the kelly drive was behaving erratically, with wavy movements. He immediately gave the call to clear the drill floor, and a few moments later, the swivel and kelly drive collapsed, falling and destroying most of the drill floor. In this case, it is possible to see that the function ‘Recognise and manage relevant external noises’ shown in Fig. 5, while not present in drilling procedures or good practices, is a critical non-technical skill for drillers’ performance.

The second one declared that, during a night shift, while drilling a short section of rocks without hydrocarbons, he started to feel a different vibration in his hand, coming from the joystick and the chair. He declared: ‘I felt this weird vibration, and then I thought: It is the f… gas coming… I have to put this down… then I smelled the gas and got that. At the time, I increased the fluid weight, reduced the speed, raised the bit a little, reduced the torque, asked the pump guys to check the level and told the roughnecks to get out of there!’ The feeling of the vibration in the equipment, as well as the smell of hydrocarbons, was a crucial non-technical skill that gave him total control of the situation, providing adequate management of the emerged risks. In other words, in complex sociotechnical systems, such as doghouse operations, only human variability is able to provide the necessary responses to ensure the productive and safe functioning of the system. In this second case, it is possible to see that the functions ‘Recognise and manage relevant hydrocarbon smells’ and ‘Recognise and manage relevant equipment vibration’, both presented in Fig. 5, helped the driller control an unpredictable situation, avoid incidents and adjust the drilling operations for this new set of conditions. Those are essential non-technical skills for drillers’ performance.

5 Conclusions

The FRAM in this research successfully structured the recognition of the human factors and non-technical skills involved in the work routine of offshore drillers. Its modelling was built and validated not only by drillers but also by FRAM specialists. Besides the well-known non-technical skills presented by the literature, such as situational awareness, communication and leadership, the drillers have developed specific non-technical skills to deal with the vicissitudes and specific demands of their work, using the natural variability of their behaviour positively, which makes the correlation between WAI and WAD happen naturally. The recognition and management of relevant external noises, hydrocarbon smells and equipment vibrations are skills observed onboard, and reported by drillers as essential to the performance of their jobs. These non-technical skills were developed by them, and are the result of their adaptation and integration into a complex sociotechnical system—the oil rig.

In addition, the FRAM analysis showed that this natural variability of the workers, reflected in the variabilities of the functions’ output, is not exactly a problem, but a solution. It can deal with the everyday variabilities of complex sociotechnical systems, providing the most suitable answer to the system’s demands, both in normal operations and in emergency situations. This FRAM modelling showed that the variability can also be something positive; in fact, in emergency situations, where multiple scenarios happen at the same time, this variability is the only reasonable solution. Particularly in the analysis of the function ‘Operating the drilling unit (doghouse)’, where it is clear that too much depends on just one driller inside the doghouse, the human factor recognition performed by FRAM showed that the drillers have to deal with twelve outputs (their daily activities) at the same time. Non-technical skills such as situational awareness and communication play an important role in the effective and safe performance of these activities. Therefore, understanding and recognising all factors that can influence human performance, whether technological, environmental, organisational or individual, as well as the interaction between these, is vital. In other words, the human factors are essential not only to avoiding accidents but also to promoting safe and productive operations. It is important to note that severe weather conditions (environmental), floods of screen alarms (technological), pressure from supervisors (organisational) and risk perception (individual) are, indeed, human factors present in the offshore operations and can definitely influence the performance of the drillers.

Ultimately, the FRAM is an effective methodology because it can identify the natural variabilities of human performance, and from this, assist in the recognition and analysis of the human factors that have influence over the workers, whether those are well-known non-technical skills or the niche ones that naturally arise from human interaction with complex systems. The intrinsic variability of human nature makes safe work possible. Through FRAM, by examining the natural variability of the worker seeking to equalise the WAI and WAD, it is possible not only to identify and analyse critical system elements, but also to adequately replicate them. In many cases, the variabilities performed by the workers were identified as the key elements for the real work or, in other words, the reason things go right and the work is done. People, workers, and their intrinsic variability are the solution, not the problem.