Introduction

Tailings are the solid material remaining after the recoverable minerals have been extracted from the mined ore, together with varying amounts of runoff and process water (waste). The physical and chemical properties of tailings vary with the nature of the ore, the geological background, the climate at the storage site, and the beneficiation process. Tailings are usually stored in surface facilities which, together with the structures that maintain them, form tailings industrial systems (TIS). These are among the largest industrial systems in the world and are also important interference areas in mining operations. In general, the basic function of a TIS is still to store waste. Therefore, in a period of rapid national economic and social development, there is a tendency to make cost minimization the main principle of decisions and actions concerning such huge TIS. Insufficient consideration is often given to the long-term stability of TIS or to possible safety and environmental contingencies and their negative impacts. Several major TIS failures have proven them to be one of the major sources of risk to mining engineering and production safety, posing a serious threat to the companies themselves and the communities involved.

Over the past 50 years, at least 63 major TIS failures have been reported worldwide [1], and the total number of TIS failures per year has been declining since 1990. However, failures with serious consequences have been on the rise (see Fig. 1). The WISE Uranium Project [2] estimates that between 1961 and 2019, at least 2,375 people worldwide have been killed by TIS disasters. Table 1 documents some of the major TIS failures that occurred globally in the decade 2010–2020, which directly or indirectly affected tens of millions of people.

Fig. 1 TIS failures per 5 years

Table 1 TIS failure cases and their impact in 2019~2020

The absence of public, timely, multi-scalar information about the interaction between personnel, management, environmental, and technological factors is a normalized feature of failure analysis for TIS disasters. Despite growing knowledge and the lessons gained from both research and past experience, major and catastrophic TIS accidents continue to occur. Historical data show that major TIS failures with catastrophic consequences occur at a rate that remains too high. Official accident analysis reports usually show that the majority of such TIS accidents were preventable. This situation indicates that there is still a need to further understand the phenomena and causes leading to major TIS failures.

TIS failure may occur at any stage of the life cycle, and in many cases the consequences are severe. On September 8, 2008, a particularly serious dam failure occurred at the TIS of the Tashan iron ore mine in Xiangfen County, Shanxi Province, China. Owing to Xinta Company's illegal construction and production, the slope of the tailings accumulation dam was too steep. At the same time, inappropriate technical measures, such as laying plastic waterproof film in the reservoir to prevent tailings water from infiltrating and covering the slope with loess to prevent dam water from seeping, led to local seepage failure of the dam body. The dam body, already in a limit state, lost its balance and slid, resulting in the dam break. The TIS in Xiangfen released about 190,000 m³ of tailings, which flooded densely populated areas just 50 meters downstream, such as office buildings, farmers' markets, and residential areas, resulting in at least 277 deaths and 33 injuries. On October 4, 2010, approximately 1 million m³ of residue was released into the environment from the aluminum plant Ajkai Timfoldgyar Zrt in western Hungary. Hundreds of houses were destroyed, 265 individuals were injured, and ten died [3]. The catastrophic collapse of the tailings dam at the Córrego do Feijão iron ore mine in the town of Brumadinho, Minas Gerais, Brazil, in January 2019 had resulted in more than 230 confirmed deaths by April 2019 [4]. Only 4 years earlier, in the same state, the Bento Rodrigues tailings dam at the Samarco mine, which shares the same owner, failed, causing the death of at least 17 people [5]. These catastrophic TIS failures have brought a renewed sense of urgency to the entire industry, civil society, and the investor community.

The occurrence of TIS failures is usually highly random, and the evolutionary pattern of each failure (e.g., incubation, causation, and development) is highly unique. Most failures are uncertain events, and their probability and statistical characteristics are unreliable. Preventing catastrophic TIS failure and achieving zero deaths is the goal that the mining industry has been pursuing. It is therefore important to consider influencing factors such as personnel behavior, process technology, and operating practice at each stage of the TIS life cycle. Establishing a differentiated, evidence-based characterization of TIS failure analysis and prevention has both theoretical significance and practical value.

Background

Because the service conditions of a TIS are constantly changing, the life cycle of a TIS is itself a continuous process of change, during which the technical processes and personnel procedures of the TIS are in constant flux. These changes can introduce potential failures and risks to production safety. Fault tree analysis (FTA), a top-down deductive failure analysis method, is a traditional linear analysis model [6]. The aim of FTA is to use deductive logic to understand all the underlying causes of a particular failure in a sufficiently complex system so that the likelihood of failure can be reduced through improved system design. FTA is both a diagrammatic deductive method and a logical reasoning method for accidents under given conditions. It can analyze a particular accident in depth, express the inherent connections, and point out the logical relationships between fault units. For this reason, FTA is mainly used in the fields of reliability and safety engineering.

In the past few decades, more than a dozen different accident analysis methods have been proposed in the field of safety science. These methods have had a positive effect on the safety and reliability of various industrial systems. At the same time, the increasing complexity of modern industrial systems calls for more general and convenient failure analysis and prevention techniques. However, it is impractical to use any single technique to analyze all aspects of an industrial system. Different accident analysis methods therefore need to be combined so that the advantages of the individual methods reinforce one another. This paper summarizes the goals, methods, and attempted improvements of the seven most used accident analysis techniques, including failure mode and effects analysis (FMEA), failure modes, effects and criticality analysis (FMECA), Bayesian networks, and fuzzy set theory, as shown in Table 2.

Table 2 Improved accident analysis technology

Peeters et al. [14] combined fault tree analysis (FTA) and failure mode and effects analysis (FMEA) in a recursive manner to improve the efficiency of failure analysis and demonstrated the effectiveness of the combined method for complex industrial equipment. Chengyuan Zhu et al. [15] conducted a dynamic analysis of the key factors of laboratory explosions by combining qualitative fault tree analysis (FTA) and quantitative binary decision diagram (BDD) analysis. Xue Lei et al. [16] used a relatively new dynamic fault tree analysis model in reliability analysis to model the complex interactions between different types of suppliers. Kirk Shanks et al. [10] applied FMECA and FTA to identify key issues that affect long-term performance. Hasan Bas et al. [17] applied fault tree analysis to check for printing errors in 3D printers. Fault trees are also applied in many other fields, including the nuclear industry [18], railroads [19, 20], and mining [21,22,23].

Fault Analysis

Focusing on the life cycle of TIS, this article adopts a method based on documentary evidence and analyzes technical standards, operating procedures, global TIS failure cases, and the Riskgate [24] website database to divide TIS faults into five categories: dam collapse, overtopping, seepage, transmission leakage, and dust emission. The factors influencing these five types of faults and the evolution of the hidden hazards were then analyzed dynamically: the basic cause events (dormant hidden hazards) and intermediate events (coupled hidden hazards) were identified, a full list of hidden hazards was compiled, and a fault tree characterizing the relationships between hidden hazards was constructed. In total, there are 202 hazards and influencing factors: 48 dam collapse hazards (32 basic events/factors and 16 intermediate events/coupling events), 53 overtopping hazards (37 basic events/factors and 16 intermediate events/coupling events), 51 seepage hazards (36 basic events/factors and 15 intermediate events/coupling events), 17 transmission leakage hazards (10 basic events/factors and 7 intermediate events/coupling events), and 33 dust emission hazards (19 basic events/factors and 14 intermediate events/coupling events).
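The hazard counts quoted above can be cross-checked with a few lines of arithmetic. The following is only a sketch: the numbers are those stated in the text, while the dictionary layout and variable names are assumptions made for illustration.

```python
# Hazard counts per failure type as stated in the text:
# (basic events/factors, intermediate events/coupling events)
hazard_counts = {
    "dam collapse": (32, 16),
    "overtopping": (37, 16),
    "seepage": (36, 15),
    "transmission leakage": (10, 7),
    "dust emission": (19, 14),
}

# Total hazards per failure type and overall
totals = {name: basic + inter for name, (basic, inter) in hazard_counts.items()}
grand_total = sum(totals.values())

for name, total in totals.items():
    share = 100 * total / grand_total
    print(f"{name:22s} {total:3d} hazards ({share:4.1f}% of all)")
print("total:", grand_total)
```

The per-type totals and the overall sum reproduce the figures reported in the paragraph (48, 53, 51, 17, 33, and 202).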

The distribution of these failure hazards over the life cycle stages of the TIS is shown in Fig. 2.

Fig. 2 TIS hazards over its life cycle

In terms of failure rate, the long-term production safety of the tailings industrial system can be called failure-prone in comparison with other industrial sectors, but the factors and causes leading to failure differ from case to case and are hard to predict. Compared with other industrial systems, each TIS fault and its causes also show greater contingency and uniqueness. The construction of high-volume, high-risk mine TIS will likely continue in the future against the backdrop of increased global demand for metals, declining ore grades, and the associated increase in mine tailings [24]. This means that we face unprecedented urgency and risk from TIS.

Based on the results of the fault tree analysis, relationships and patterns can be drawn between the influencing factors, potential hazard formation, and evolution processes, and potential consequences regarding the failure of the TIS and its risks, specifically including the following main aspects:

  1. The numbers of basic events or dormant hazards for dam collapse, overtopping, seepage, transmission leakage, and dust emission are 32, 37, 36, 10, and 19, respectively. They involve improper design, construction personnel errors, staff errors, maintenance personnel errors, equipment failure, etc., as shown in Tables 3, 4, 5, 6 and 7.

  2. The numbers of minimal cut sets in the fault trees (failure combinations of basic events or dormant hazards) for dam collapse, overtopping, seepage, transmission leakage, and dust emission are 29, 34, 33, 9, and 17, respectively. For dam collapse, for example, there are 29 possible evolution paths: once the basic events, dormant hazards, or coupling conditions in any one of the 29 minimal cut sets occur, they can lead to TIS failure.

    According to Figures 3, 4, 5, 6 and 7, the high number of cut sets containing a single event means that once a basic event or a dormant hidden problem appears, it may directly lead to a failure.

  3. The numbers of minimal path sets in the fault trees (minimal combinations of basic events or dormant hazards whose non-occurrence prevents the top event) for dam collapse, overtopping, seepage, transmission leakage, and dust emission are 8, 8, 8, 2, and 4, respectively. For dam collapse, there are thus eight comprehensive prevention and control schemes, each of which avoids dam collapse only if none of the basic events in the corresponding minimal path set occurs.
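The notions of minimal cut set and minimal path set used in the list above can be illustrated on a small example. The sketch below uses a hypothetical toy tree with made-up basic events X1 to X6; it is not one of the fault trees in Figs. 3 to 7, and the recursive expansion shown is one common way of obtaining the sets, not necessarily the procedure used in this study.

```python
from itertools import product

# Hypothetical toy tree (NOT one of the paper's fault trees).
# A gate is ("AND" | "OR", [children]); a leaf is a basic-event name.
TREE = ("OR", [
    "X1",                                   # single event that alone causes the top event
    ("AND", ["X2", "X3"]),                  # both events must occur together
    ("AND", ["X4", ("OR", ["X5", "X6"])]),
])

def minimize(sets):
    """Drop every set that is a strict superset of another set."""
    unique = set(sets)
    return {s for s in unique if not any(t < s for t in unique)}

def cut_sets(node):
    """Return the minimal cut sets of a node as a set of frozensets."""
    if isinstance(node, str):
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any cut set of any child is a cut set of the gate.
        combined = set().union(*child_sets)
    else:
        # AND: pick one cut set from each child and merge them.
        combined = {frozenset().union(*combo) for combo in product(*child_sets)}
    return minimize(combined)

def dual(node):
    """Swap AND and OR gates; cut sets of the dual tree are path sets of the original."""
    if isinstance(node, str):
        return node
    gate, children = node
    return ("AND" if gate == "OR" else "OR", [dual(child) for child in children])

def path_sets(node):
    """Return the minimal path sets of a node as a set of frozensets."""
    return cut_sets(dual(node))

for label, sets in (("cut", cut_sets(TREE)), ("path", path_sets(TREE))):
    print(label, sorted(sorted(s) for s in sets))
```

In this toy tree the single-event cut set {X1} plays the role described in item 2: one basic event is enough to trigger the top event. Each path set, in turn, lists events that must all be prevented together, which is why X1 appears in every path set.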

Table 3 Factors and hazards causing TIS dam collapse
Table 4 Factors and hazards causing TIS overtopping
Table 5 Factors and hazards causing TIS seepage
Table 6 Factors and hazards causing TIS transmission leakage
Table 7 Factors and hazards causing TIS dust emission
Fig. 3 Fault tree analysis of TIS dam collapse

Fig. 4 Fault tree analysis of TIS overtopping

Fig. 5 Fault tree analysis of TIS seepage

Fig. 6 Fault tree analysis of TIS transmission leakage

Fig. 7 Fault tree analysis of TIS dust emission

According to the results of the analysis, since each minimal path set contains several basic events or dormant hazards, multiple basic events or dormant hazards must be prevented at the same time to avoid failures effectively.

FTA for Dam Collapse

Dam collapse is the most devastating of these five types of TIS failure. Once it occurs, it is usually accompanied by secondary geological hazards, such as mudslides. At present, major parts of the existing tailings industrial system infrastructure in most countries are nearing the end of their service life, and the relevant staff have little understanding of the seriousness of the risk of dam failure and are negligent in daily operation and maintenance work, so the risk of dam failure is greater. Dam collapse also mostly occurs because the storage volume of the TIS cannot accommodate the large amount of sediment and wastewater, so that the water level in the dam exceeds the height of the dam wall itself. Large volumes of tailings with high potential energy are then discharged directly into the area outside the dam.

According to existing accident statistics, most dam-break accidents occur in the operation phase. If extreme rainfall occurs and surface runoff cannot remove the rainwater in time, the carrying capacity of the TIS is severely tested. Extreme rainfall events must therefore be fully considered throughout the life cycle of a TIS. Because this natural meteorological factor affects every stage of the TIS, appropriate protective measures and timely scientific forecasts are required. Most other basic events or dormant hazards can be controlled by human intervention and resolved by taking appropriate measures. It is therefore necessary to take a long-term view when designing and constructing a TIS, both considering the hydrology and climate of the TIS location and strictly controlling construction quality.

FTA for Overtopping

Overtopping is a high-incidence failure that mostly appears in the season of concentrated rainfall. When the spillway cannot remove intruding rainwater in time, the storage capacity of the TIS cannot hold the large volume of tailings and rainwater at the same time. The water level in the pond then exceeds the allowable design height, and the failure occurs. If the TIS is located in a mountainous area, overtopping can also lead to secondary disasters such as landslides and soil and water pollution.

Overtopping failure seriously threatens the lives of people in the surrounding area. Most existing tailings dams are of granular structure, and overtopping will breach the dam body, which can eventually lead to a dam collapse failure. Overtopping also tests the reliability of drainage facilities. This places high demands on the design of the TIS, which should take extreme rainfall fully into consideration. In addition, an emergency plan should be prepared before any overtopping failure occurs. Even when the drainage facilities are working normally, continuous heavy rainfall may keep the water level in the reservoir at a dangerous height. How to respond promptly to such a high water level is a serious problem for the relevant personnel, and it requires strengthening their awareness of prevention and their ability to perceive risk.

FTA for Seepage

Seepage failure in TIS has important influencing factors throughout the life cycle. In the initial design phase, the topographic and geological conditions of the TIS determine the degree of impermeability. For instance, the geological conditions of karst and loess areas have an extremely negative impact on TIS seepage. In karst areas, cave development can create natural leakage channels. In loess areas, the problem is the large fissures and well-developed joints of the loess.

To prevent TIS seepage failure, natural factors must first be fully taken into account. At the same time, more attention should be paid to earthquake and heavy rainfall events, and the safety awareness of the relevant TIS personnel should be cultivated.

FTA for Transmission Leakage

Transmission leakage mainly occurs in the TIS operation stage. Such failures can pose a serious threat to nearby residents and the ecological environment, because the tailings transported by pipeline mostly contain harmful chemicals from mineral processing.

To effectively avoid TIS transmission leakage failure, more emphasis should be put on the design phase. Pipeline selection for the complete conveying system should be reasonable and rigorous. During the daily operation of the TIS, the safety supervision of pipeline maintenance personnel should be increased to ensure the normal operation of all aspects of TIS pipeline transportation.

FTA for Dust Emission

Most dust emissions in the reservoir area occur in dry weather. If the tailings particle size is small, the tailings easily solidify on the dry beach and, in windy conditions, often cause dust failure in the reservoir area. Such failures can seriously threaten the health of residents, as high winds can carry harmful substances such as tailings sand, causing lung diseases. To prevent dust emission failure, it is very important to do the early design work well and to understand the physical and chemical properties of the tailings in detail. In windy climates, appropriate protective measures are also necessary to avoid exposing large amounts of tailings.

Prevention Framework

Industrial failures have shown that the performance of highly complex socio-technical systems depends on the interaction of technical, human, social, organizational, managerial, and environmental factors. Risk is an emergent property of complex socio-technical systems and is influenced by the decisions of all participants in the system. In this study, TIS failure risk is typically caused by multiple contributing factors at all levels of the organization, not just a single catastrophic decision or action. A TIS failure prevention framework must therefore consider the operating environment, the system functions, and their interactions at the system level, while gaining insight into the possible mechanisms of hazard propagation and human-machine interaction.

Failure Probability Assessment

Quantitative risk analysis methods generally recommend assessing the likelihood of a major failure and determining from that assessment whether the risk is acceptable. Over the whole life cycle of a TIS, however, such a probability calculation is not an easy task. In some TIS risk probability assessments, the calculation can only select a largely arbitrary frequency range, and the existing safety level of the TIS is not considered. Such probability values also do not reflect the efforts made by mining companies in TIS failure prevention, mitigation, and safety management systems.

The Accidental Risk Assessment Methodology for Industries (ARAMIS) [25] is an alternative method that focuses on the safety systems of industrial facilities. Its principle is to start from the frequency of the deep causes of failures and to reduce that frequency according to the safety functions performed in the field, taking into account the possibility that those functions themselves fail. According to the ARAMIS assumptions, the frequency (probability) of TIS failures is determined by two components: the frequency of the underlying event, i.e., the cause of the failure, and the functional reliability of the safety system that prevents such failures from occurring.

Estimating the Frequency of Basic Events

In the field of TIS risk research, there is a clear lack of effective data support. According to published data, the frequency of basic events and the amount of data available for different types of initial events vary significantly.

To better estimate the frequency (probability) of the underlying events, it is recommended to use TIS-specific data where such data are available. Alternatively, the qualitative assessment in Table 8 can be used to estimate the frequency of basic events in TIS.

Table 8 Qualitative and quantitative analysis of basic event

Determine the Confidence Level of the Safety Function

To identify safety systems that have an impact on the occurrence of failures, the concepts of safety functions and safety barriers are introduced. Safety functions are technical or organizational actions, not objects or physical systems. In the TIS fault tree, the role of a safety function is to avoid, prevent, limit, or reduce the likelihood of an event occurring. Safety functions are the "elements" needed to ensure, increase, or promote safety, and safety barriers are the means to achieve them. A safety barrier can be a physical or engineered system, or a human operation based on specific procedures or management controls. For TIS, safety barriers include systems such as tailings dam displacement monitoring equipment and water level monitoring equipment.

The safety integrity level (SIL) concept in the IEC 61508 [26] standard indicates the level of confidence that the safety system will operate correctly and ensure an adequate response to any failure in the manufacturing plant or other systems, as shown in Table 9.
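As a rough numerical illustration of what a SIL expresses, the sketch below maps a SIL to the band of average probability of failure on demand (PFD) that IEC 61508 tabulates for low-demand operation. It is not a reproduction of Table 9; the function names are hypothetical, and treating a barrier's confidence level as numerically equal to its SIL is an assumption made here for illustration.

```python
def sil_pfd_band(sil):
    """Return (lower, upper) bounds of the average PFD for a SIL (low-demand mode).

    SIL n corresponds to 10**-(n+1) <= PFD < 10**-n, for n = 1..4.
    """
    if sil not in (1, 2, 3, 4):
        raise ValueError("SIL must be 1, 2, 3 or 4")
    return 10.0 ** -(sil + 1), 10.0 ** -sil

def minimum_risk_reduction(sil):
    """Smallest risk reduction factor implied by the band, i.e. 10**n (1 / upper PFD bound)."""
    lower, upper = sil_pfd_band(sil)
    return round(1.0 / upper)

for level in (1, 2, 3, 4):
    low, high = sil_pfd_band(level)
    print(f"SIL {level}: {low:.0e} <= PFD < {high:.0e}, "
          f"risk reduction >= {minimum_risk_reduction(level)}")
```

A higher level therefore means that the barrier is expected to fail on demand less often, which is what allows it to be credited with a larger reduction in event frequency.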

Calculate the Probability of Key Events

After estimating the characteristics of the underlying events, determining the safety barriers, and assessing their confidence levels, the TIS fault tree can be analyzed to calculate the associated critical event probabilities. This step starts with the basic events of the fault tree and considers the safety barriers on the fault tree, gradually approaching the calculation of critical events.

The main principles of calculating the critical event probability are as follows: If the confidence level of barriers on a fault tree branch is equal to n, the frequency of events upstream and downstream of the branch will be reduced by a factor of 10^n. Therefore, we can calculate the frequency of various basic events in the TIS fault tree, as well as the frequency of key events considering the safety barrier. The result of the example is shown in Fig. 8.

Fig. 8 Calculation of probability of key events in fault tree
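The 10^n reduction rule can be sketched in code. The example below is a minimal illustration and not the calculation behind Fig. 8: the branch names and frequencies are hypothetical, several barriers on one branch are assumed to act independently so that their reductions multiply, and the OR gate uses the rare-event approximation (input frequencies are summed).

```python
def apply_barriers(frequency_per_year, confidence_levels):
    """Reduce a branch frequency by a factor 10**n for each barrier of confidence level n."""
    reduced = frequency_per_year
    for level in confidence_levels:
        reduced /= 10 ** level
    return reduced

def or_gate(frequencies):
    """Rare-event approximation for an OR gate: the input frequencies simply add."""
    return sum(frequencies)

# Hypothetical branches feeding one critical event (illustrative values only):
# name -> (basic-event frequency in events/year, confidence levels of barriers)
branches = {
    "drainage facility blocked": (1e-2, [1]),
    "extreme rainfall": (1e-1, [1, 1]),
    "inspection omitted": (1e-1, []),   # no barrier on this branch
}

mitigated = {
    name: apply_barriers(freq, levels) for name, (freq, levels) in branches.items()
}
critical_event_frequency = or_gate(mitigated.values())

for name, freq in mitigated.items():
    print(f"{name:28s} {freq:.1e} per year")
print(f"critical event               {critical_event_frequency:.2e} per year")
```

In this sketch the unprotected branch dominates the result, which illustrates why the placement of safety barriers matters as much as their individual confidence levels.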

Reference Failure Scenario Selection using a Two-dimensional Risk Matrix

One of the key steps in the failure analysis process is the selection of reference failure scenarios, which are the assumptions considered for modeling the failure scenarios. Each reference failure scenario is defined by a basic event that triggers a serious failure, which may lead to different hazardous phenomena.

The selection of reference failure scenarios for TIS is based on an assessment of the frequency of hazardous phenomena and their potential consequences. Therefore, it is necessary to broadly assess the consequences of each TIS failure. At this stage, only a qualitative assessment of the potential consequences is available; the subsequent severity calculations involve a quantitative assessment, but only after a reference failure scenario has been selected.

The qualitative assessment of the consequences of the failure is based on the four categories of consequences defined in Table 10. These categories are defined based on the domino effect, the impact on human populations, and the potential environmental consequences.

Table 9 Definition of safety barrier confidence
Table 10 Consequence categories

A two-dimensional risk matrix is used to select the TIS failure scenarios that will be modeled in the severity calculation. The X-axis of the matrix corresponds to the four consequence categories in Table 10, and the Y-axis corresponds to the frequency (in years) of various failures in the TIS in the existing case statistics. Three regions are defined in this matrix. The "negligible impact" area on the lower left corresponds to TIS failure scenarios that are of sufficiently low frequency or consequence that they may not have a substantial impact on people. The "medium impact" area in the middle corresponds to TIS failure scenarios that may have a substantial impact on people. The "high impact" area in the upper right corresponds to very dangerous TIS failure scenarios, which are bound to have a substantial impact on people. The TIS failure scenarios in the "high impact" area should therefore be the main focus.

According to the results of the TIS fault tree analysis, the frequency and consequence categories of each TIS fault are placed in the risk matrix (see Fig. 9). The failures in the "medium impact" and "high impact" areas must be modeled to calculate their severity.

Fig. 9 Accident risk matrix of TIS
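The scenario selection described above can be sketched as a simple lookup. The sketch below is hedged throughout: the consequence labels C1 to C4 follow the four categories of Table 10, but the frequency grades F4 to F0, their ordering (F0 taken as the most frequent), and the diagonal boundaries between the three regions are assumptions made for illustration and are not the actual boundaries of Fig. 9.

```python
# X-axis: consequence classes, least to most severe (four categories, as in Table 10)
CONSEQUENCE_CLASSES = ["C1", "C2", "C3", "C4"]
# Y-axis: frequency grades, ASSUMED to run from rarest (F4) to most frequent (F0)
FREQUENCY_GRADES = ["F4", "F3", "F2", "F1", "F0"]

def impact_region(frequency_grade, consequence_class):
    """Classify a failure scenario into one of the three matrix regions.

    The region boundaries below are hypothetical diagonal bands.
    """
    f = FREQUENCY_GRADES.index(frequency_grade)
    c = CONSEQUENCE_CLASSES.index(consequence_class)
    score = f + c  # 0 (rare, mild) .. 7 (frequent, severe)
    if score <= 2:
        return "negligible impact"
    if score <= 4:
        return "medium impact"
    return "high impact"

def needs_severity_modelling(frequency_grade, consequence_class):
    """Scenarios in the medium and high impact regions are carried forward."""
    return impact_region(frequency_grade, consequence_class) != "negligible impact"

for scenario in [("F4", "C1"), ("F2", "C2"), ("F0", "C4")]:
    print(scenario, impact_region(*scenario), needs_severity_modelling(*scenario))
```

Encoding the matrix as a function rather than as a picture makes the selection rule reproducible when scenarios are reassessed with updated frequencies or consequence categories.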

For reference failure scenarios (dam collapse, overtopping, and seepage), the establishment of a failure prevention framework requires a lot of key information, such as the characteristics of the TIS (including water storage, dam height, operating time, etc.) and the characteristics of events in the fault tree (e.g., leakage diameter, release time, sensitivity to geological disasters, etc.).

Case Study

On January 25, 2019, the collapse of the Brumadinho tailings dam in Brazil killed 259 people and left 11 unaccounted for when it rapidly released tailings (approximately 32 million cubic meters) that flowed downstream at high speed (see Fig. 10). Large volumes of tailings were discharged into the river, potentially polluting about 300 kilometers of waterway and severely damaging the local ecosystem. The failure was considered an industrial, humanitarian, and environmental disaster, as well as a public catastrophe. It is considered the second-largest industrial disaster of the century and the largest occupational accident in Brazil [27].

Fig. 10 Brumadinho dam collapse in Brazil (Photo by: Isac Nóbrega/PR)

The Córrego do Feijão B1 tailings dam in Brumadinho, Brazil, has a height of 86 meters, a top length of 720 meters, an area of 2.495 million square meters, and a storage capacity of about 1.2 × 10^6 m³. It is located 9 kilometers northeast of the town of Brumadinho in Minas Gerais, Brazil. The dam was built using the upstream damming method. According to the Google satellite map, the distance between the nearest downstream construction facility and the TIS is estimated to be less than 300 m [28].

According to an internal report, Vale (the owner of the Brumadinho dam) knew that the dam was unsafe as early as 2003. The report described unsafe behaviors of the company, some employees, and various auditors, and strongly condemned these behaviors. In addition, the Brazilian National Mining Agency (ANM) is responsible for the supervision of mining activities in Brazil, from mining authorization to company supervision, including mineral exploration tax and tailings industrial system inspection. Unfortunately, the ANM has a poor management structure, with insufficient staff, a lack of training, and low skill levels because of the annual budget cuts made by successive governments. A week before the catastrophic failure, the structural foundation of the dam was suspected of leaking, and the dam staff failed to repair the problem. This reflects the poor oversight practices of ANM [29].

Most of the victims of the Córrego do Feijão tailings dam failure were employees of Vale, the company that owns the dam, because mining facilities such as offices, cafeterias, workshops, and processing plants were concentrated downstream, close to the tailings dam. At least 270 people died in the collapse, most of them employees of the mine. The mudslide destroyed parts of the Córrego do Feijão area, including a nearby hotel and several rural properties, as well as a section of a railroad bridge and about 100 meters of track [29].

According to the above analysis, the relevant personnel of the TIS in Brumadinho, Brazil, had a weak awareness of the risk situation, improper attitudes and motivation, and a low level of knowledge and skills, and there was a lack of safety barriers to prevent failure. It can therefore be estimated that the probability of failure was very high, corresponding to grade F0. According to the two-dimensional risk matrix, the severity of the dam break of the TIS is determined as grade C4. Finally, a framework for the prevention and control of TIS failures in Brazil is constructed and prevention and control measures are given, as shown in Fig. 11.

Fig. 11 Failure prevention framework for TIS in Brumadinho, Brazil

Tailings dam practice should be carried out by qualified engineers with expertise and experience in tailings dam management. Based on various investigations and analyses of TIS failures in Brazil, the corresponding preventive measures are given as follows:

  1. Design

“Failure to characterize the dam foundation conditions” corresponding preventive measures:

  • Conduct adequate geological and geotechnical investigations (define rock types, weathering, geological structures, and orientations; define groundwater levels and flows, and hydrogeological parameters; characterize strata and geotechnical parameters such as shear strength, compressibility, and permeability).

  • Carry out hydrogeological and geotechnical analyses and tailings dam geotechnical stability analyses, including seismic assessment of the dam and the tailings, and assess the staged construction of the tailings dam.

  • Maintain adequate documentation during construction: design drawings and as-built documentation, noting deviations from the design and the reasons for them.

“Inappropriate selection of impoundment type” corresponding preventive measures:

  • Investigate the surrounding topography, including catchment sizes.

  • Investigate potential seepage pathways (e.g., faults, paleochannels, connections to aquifers), which could lead to piping failure of the dam wall.

“Low qualification of designers” corresponding preventive measures:

  • Design by qualified and experienced experts as required.

“Inappropriate choice of construction methods” corresponding preventive measures:

  • Consider site and tailings constraints (topography, available footprint, availability of suitable borrow materials, seismic risk, rate of rise of tailings, and nature of the tailings).

“Failure to address spillway requirements” corresponding preventive measures:

  • Characterize the extreme rainfall events.

  • Determine the potential for contaminated runoff.

  • Where excess water can be discharged to the environment during operations, provide a spillway with sufficient capacity to handle the volume of water involved, to prevent overtopping of the dam wall.

  • Assess post-closure spillway requirements (the permanent spillway should preferably be constructed through natural rock rather than over the wall; consider long-term erodibility of the spillway; provide a settling pond for the retention of suspended solids in any runoff from the tailings; maintain spillway capacity).

  2. Construction

“Construction not able to accommodate the tailings” corresponding preventive measures:

  • Improve forward planning for tailings dam wall raises to maintain adequate freeboard (ideally completing raises before the start of each wet season and in anticipation of planned increases in tailings production)

  • Increase the rate of construction of the tailings dam wall raises
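The forward-planning measure above comes down to simple arithmetic: how long it takes the rising tailings surface to reach the minimum freeboard, and whether a wall raise can be finished before then. The following is a minimal sketch, assuming a constant rate of rise; the function names, parameters, and numerical values are hypothetical and are not taken from the reviewed cases.

```python
def months_until_freeboard_limit(crest_elevation_m, tailings_level_m,
                                 min_freeboard_m, rate_of_rise_m_per_month):
    """Months before the tailings surface encroaches on the minimum freeboard.

    Assumes a constant rate of rise (a simplification for illustration).
    """
    if rate_of_rise_m_per_month <= 0:
        raise ValueError("rate of rise must be positive")
    available_m = crest_elevation_m - tailings_level_m - min_freeboard_m
    if available_m <= 0:
        return 0.0  # freeboard is already at or below the minimum
    return available_m / rate_of_rise_m_per_month


def raise_must_start_now(months_available, construction_months,
                         months_to_wet_season):
    """True if the raise cannot be completed before the earlier of the
    freeboard limit and the start of the wet season unless work begins now."""
    deadline = min(months_available, months_to_wet_season)
    return construction_months >= deadline


# Hypothetical example: 3 m of usable height consumed at 0.25 m per month.
months = months_until_freeboard_limit(105.0, 100.0, 2.0, 0.25)
print(months, raise_must_start_now(months, construction_months=4,
                                   months_to_wet_season=6))
```

In practice the rate of rise changes with production and with the impoundment geometry, so such a check would need to be repeated whenever the production plan is revised.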

“Not constructing according to design” corresponding preventive measures:

  • Confirm the selection and use of appropriate borrow materials

  • Maintain adequate construction quality assurance and quality control (QA/QC) during the construction of the wall or raise.

  3. Operation

“Poor maintenance of dam walls” corresponding preventive measures:

  • Inspect dam walls daily (carry out any maintenance required; undertake detailed inspections following extreme rainfall events)

  • Conduct regular reviews and audits of the TIS

  • Maintain water management in the event of temporary or permanent decommissioning of the tailings facility

“Poor control of tailings deposition” corresponding preventive measures:

  • Cycle tailings deposition to allow drainage, consolidation, and desiccation, in order to maximize the tailings density achieved and hence minimize the storage volume required

  4. Post-closure

“Discovering potential dam hazards” corresponding preventive measures:

  • Implement tailings storage facility wall failure provisions in emergency response plans

“Failure to deal with dam hazards in time” corresponding preventive measures:

  • Implement trigger action response plans (TARPs) in response to daily and routine inspection findings and audits of the tailings storage facility wall (e.g., excessive or increasing wall seepage, vegetation, cracking and deformation, and bulging).
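A TARP of the kind described above can be thought of as a table that maps inspection findings to escalating response levels. The sketch below is illustrative only: the indicator names, thresholds, response levels, and actions are hypothetical placeholders, and a real plan would set them from site-specific design criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trigger:
    indicator: str    # name of the inspected quantity
    threshold: float  # a reading at or above this value activates the trigger
    level: int        # 1 = heightened monitoring, 2 = engineering review, 3 = emergency


# Hypothetical triggers; values are illustrative, not recommended limits.
TRIGGERS = [
    Trigger("wall_seepage_l_per_s", 1.0, 1),
    Trigger("wall_seepage_l_per_s", 3.0, 2),
    Trigger("crack_width_mm", 5.0, 2),
    Trigger("bulging_mm", 10.0, 3),
]

# Hypothetical actions for each response level.
ACTIONS = {
    0: "continue routine inspections",
    1: "increase inspection frequency",
    2: "commission a geotechnical review",
    3: "activate the emergency response plan",
}


def tarp_level(readings):
    """Return the highest response level activated by the inspection readings."""
    level = 0
    for trigger in TRIGGERS:
        value = readings.get(trigger.indicator)
        if value is not None and value >= trigger.threshold:
            level = max(level, trigger.level)
    return level


def tarp_action(readings):
    """Return the action associated with the activated response level."""
    return ACTIONS[tarp_level(readings)]


print(tarp_action({"wall_seepage_l_per_s": 1.5, "crack_width_mm": 2.0}))
```

Taking the maximum level across all triggers means that a single severe finding is enough to escalate the response, regardless of how benign the other readings are.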

Discussion: Limitations of Current Work and Directions for Future Studies

Conceptualizing the risk of TIS is a difficult task because of the interactions among the factors of the TIS and the complexity of the local environment. In principle, TIS failure analysis should alleviate the difficulties associated with identifying the risk level of TIS by providing comprehensive information.

However, the risk profile of TIS is dynamic, which means that even the most extensive and comprehensive TIS failure analysis will become outdated within a few years. Risk characterization of TIS remains complex throughout the life cycle. Unlike other industrial systems, TIS has a significant tendency to expand over time, because waste accumulates exponentially over the mine life cycle.

The case study of the Brumadinho tailings dam in Brazil demonstrates the practical application of the failure prevention framework. Unfortunately, the framework was built on post-accident information from Brumadinho, so it is difficult to validate the accuracy of the risk analysis or of the prevention and control measures against conditions before the accident. Likewise, the relationship between TIS failure analysis and the prevention framework warrants further exploration.

Official feedback (from the Special Achievement Acceptance Review Meeting held on April 14, 2021 in Beijing) has confirmed that the failure analysis and prevention framework for TIS is consistent with the views of relevant experts and provides a great deal of information that is useful for risk prevention and control purposes. However, further evidence is needed to validate the failure prevention framework, and more detailed information is required to improve the practicability of the failure analysis, especially regarding risk likelihood. In practice, practicability could be assessed by considering the effectiveness of the failure prevention framework when applied to several different TIS; this is outside the scope of the present paper but should be pursued in future. It is therefore recommended that future research use dynamic fault trees to account for the influence of dynamic changes to TIS as they continue to grow rapidly. This could include scenario-based predictive risk evaluation and prevention.
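To make the point about time-dependent risk concrete, the sketch below re-evaluates a small static fault tree with basic-event probabilities that grow with exposure time. This is deliberately simpler than a true dynamic fault tree, which adds sequence-dependent gates. The tree structure, the event rates, the exponential model P(t) = 1 − exp(−λt), and the assumption of independent basic events are all hypothetical choices made for illustration and are not the model used in this study.

```python
import math


def or_gate(probs):
    """Probability that at least one independent input event occurs."""
    none_occur = 1.0
    for p in probs:
        none_occur *= (1.0 - p)
    return 1.0 - none_occur


def and_gate(probs):
    """Probability that all independent input events occur."""
    joint = 1.0
    for p in probs:
        joint *= p
    return joint


def basic_event_probability(rate_per_year, years):
    """Probability of at least one occurrence within `years`, assuming a
    constant occurrence rate (exponential model)."""
    return 1.0 - math.exp(-rate_per_year * years)


def overtopping_probability(years):
    """Hypothetical tree: overtopping requires extreme rainfall AND
    (inadequate spillway capacity OR insufficient freeboard)."""
    rainfall = basic_event_probability(0.01, years)   # illustrative rate
    spillway = basic_event_probability(0.02, years)   # illustrative rate
    freeboard = basic_event_probability(0.05, years)  # illustrative rate
    return and_gate([rainfall, or_gate([spillway, freeboard])])


for t in (1, 5, 10, 20):
    print(t, round(overtopping_probability(t), 5))
```

Even this simplified version shows the top-event probability rising with the age of the facility, which is why a one-off analysis loses validity over time.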

Conclusion

This study presents a systematic analysis and integrated characterization of TIS failures and proposes a prevention framework that integrates failure potential prevention, control, and abatement mechanisms. Three main points were derived as follows:

  1. Based on the whole life cycle analysis of TIS, 32 basic events, 29 risk evolution pathways, and 8 comprehensive measures for failure prevention and control were derived for dam collapse. For overtopping, there are 37 basic events, 34 risk evolution pathways, and 8 comprehensive measures for failure prevention and control. For seepage, there are 36 basic events, 33 risk evolution pathways, and 8 comprehensive measures for failure prevention and control. For transmission leakage, there are 10 basic events, 9 risk evolution pathways, and 2 comprehensive measures for failure prevention and control. For dust emission, there are 19 basic events, 17 risk evolution pathways, and 4 comprehensive measures for failure prevention and control.

  2. By combining the frequency of basic events in TIS failures with the reliability of safety functions, the probability of failure was characterized, and a method was given for calculating the probability of occurrence of critical events from these two inputs. The frequency and consequences of TIS failures were then assessed with a risk matrix to identify reference failure scenarios.

  3. Using the example of the TIS failure in Brazil, a multi-level framework of risk prevention measures for different life cycle stages was constructed. The framework can contribute to disaster reduction and prevention for TIS.