12.1 Introduction

Global digitization can improve reliability and reduce costs, so maintenance managers need to be more ambitious in their move toward digital maintenance [1]. With the digitization of maintenance operations and reliability in heavy industries, it is expected that the availability of company assets will increase by 5–15% and their repair and maintenance costs will also decrease by 18–25%.

The decision-making processes that support maintenance and reliability operations may be sped up and standardized with the aid of new digital technologies. For instance, reliability teams may plan and manage repair or replacement decisions throughout the lifecycles of individual assets or whole fleets with the use of digital asset management systems. New digital technologies can also help teams select the best maintenance strategy (e.g., run-to-fail, scheduled preventive maintenance, or condition-based maintenance) for each piece of equipment, and they can promote reliability-centered maintenance [2].

With the advent of the Internet and the widespread use of information technology, the manufacturing industry has been reshaped by digital information technology. As the digital, physical, and human worlds increasingly integrate, the industry is undergoing a deep transformation that has led to the emergence of the Fourth Industrial Revolution, called Industry 4.0. It offers opportunities for factories to be used as open platforms and distributed systems that operate faster and more efficiently, with a more flexible and resilient supply chain [3].

Based on the change in the manufacturing environments and the increasing competition among companies, we need a new concept to define and build manufacturing factories. This is because the future industrial factory must work as a flexible, resilient, and affordable system. To illustrate this new concept more clearly, the past Industrial Revolutions are examined in this section [4].

The First Industrial Revolution used steam power to bring about major changes in industry in the eighteenth century. The Second Industrial Revolution was made possible by electric power and assembly lines. During the Third Industrial Revolution, computers, information technology, and computer-aided systems became integral parts of manufacturing. A major feature of the Fourth Industrial Revolution is the intensive use of automation and data exchange in manufacturing. Cyber-physical systems, the Internet of Things (IoT), 3D printing, digital twins, advanced analytics, and cloud computing, among others, are used in the new systems [5]. This transformation is illustrated in Fig. 12.1.

Fig. 12.1 Overview of industrial revolutions over time [7]

To benefit fully from digitalization and intelligent control, industrial systems must digitalize their processes and integrate cyber-physical systems. In addition to increasing efficiency, supporting systems such as maintenance, logistics, and supply chain need to be integrated into the main system. The result is a smart system that consists of many subsystems with a dynamic structure. Changing the manufacturing environment in this way increases the speed and flexibility of the system. Therefore, investment in smart manufacturing and Industry 4.0 has been increasing rapidly, and several countries have focused on this subject. A variety of research methods have also been used to introduce and analyze Industry 4.0 and smart manufacturing systems [6].

In order to implement Industry 4.0, several fundamental requirements must be met [8]:

  • Integrated enterprise systems and interoperability

  • An organization that is distributed

  • A model-based approach to monitoring and controlling

  • Environments and systems that are heterogeneous

  • A dynamic and open structure

  • Teamwork and collaboration

  • Human-to-machine integration and interoperability

  • The ability to scale, be agile, and be fault-tolerant

  • A network of interdependence

  • Collaborative manufacturing platforms that are service-oriented

  • Decision support systems based on data-driven analysis, modeling, control, and learning.

Additionally, different types of technological innovations should be implemented to establish a smart factory [9,10,11]. There are several technologies involved, such as software, advanced collaborative robotics, configurations that are modular and adaptable, high-speed data transfer systems, and others. As a prerequisite to a fully smart system, we need a smart supply chain, smart maintenance system, and smart labor. From a technical standpoint, this type of system presents a number of challenges. This has prevented some companies from implementing this idea and there is still a long way to go.

It is well known that manufacturing systems must be reliable and readily available. Design, implementation, and utilization processes should include considerations for security, safety, and maintainability. Therefore, when a smart factory idea is investigated, these challenges and opportunities must be considered from a reliability engineering point of view. The rest of the chapter explores smart reliability analysis and smart safety management based on big data, Internet of Things, cyber-physical system, and so on.

12.2 Reliability Analysis

12.2.1 Big Data and Data Processing

Intelligent systems incorporate advanced instruments and facilities to collect and analyze data at different phases in the life cycle of a product, such as raw materials, machine operations, facility logistics, quality control, product use, and warranty duration. This data plays a crucial role in smart systems, and big data empowers companies to develop more flexible and effective strategies to compete in the market. It is imperative to store and analyze the data collected from manufacturing systems. As industrial development progressed and computerized systems were integrated into manufacturing, data came to be collected and stored by machines. In recent years, the capabilities of information technology have grown rapidly, and advanced technologies (e.g., big data analytics, the Internet of Things, cloud computing, and artificial intelligence) are becoming more prevalent in industrial and business systems. The integration of IT with these systems has created a new paradigm called Industry 4.0. Figure 12.2 illustrates how data has evolved in manufacturing systems; a similar pattern of data evolution can probably be assumed for other systems.

Fig. 12.2 History of data volume, variety, and complexity in manufacturing systems [12]

The big data collected must be processed and applied in order for system performance to improve. Because different sensors and sources are used, this data contains parameters of varying type, quality, and form. Various types of data may be collected, including video, voice, electronic signals, images, and others, and these should be preprocessed, processed, and analyzed before they can be applied. Raw data has little value on its own and may be noisy; it should therefore be converted into specific information, with content and context that users can directly understand. Achieving this requires advanced methods such as cloud computing, neural networks, and deep learning.
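As a rough illustration of turning noisy raw measurements into information a user can read directly, the following sketch smooths a signal and summarizes it. The function names, the median-filter window, the sample values, and the 90 °C limit are illustrative assumptions and are not taken from any cited method.

```python
from statistics import median

def median_filter(samples, window=3):
    """Smooth a raw signal with a sliding median to suppress isolated spikes."""
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        smoothed.append(median(samples[lo:hi]))
    return smoothed

def summarize(samples, limit):
    """Convert a signal into a short, directly readable status report."""
    peak = max(samples)
    return {
        "mean": sum(samples) / len(samples),
        "peak": peak,
        "status": "ALERT" if peak > limit else "OK",
    }

# Hypothetical bearing temperatures in degrees Celsius; 250.0 is a sensor glitch.
raw = [71.0, 72.0, 250.0, 73.0, 74.0, 73.5]
clean = median_filter(raw)
report = summarize(clean, limit=90.0)
print(report)
```

Without the filtering step, the single glitch would trigger a false alert, which is one reason raw data is usually preprocessed before it is interpreted.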

In recent decades, cloud-based big data processing has attracted considerable research interest, and various computing models have been proposed for different platforms and purposes, such as stream-based, batch-based, directed acyclic graph-based, graph-based, interactive, and visual processing.

Neural networks are a powerful tool in reliability engineering, particularly for predicting how long equipment will remain usable. Although artificial neural networks (ANN) are beneficial for data processing, deep learning is more effective. The benefits sought from deep learning include reduced operating expenses, improved productivity, reduced downtime, responsiveness to changing customer demand, improved visibility, and more value extracted from operations for worldwide competitiveness.

As computing techniques and data processing have advanced, computer-aided engineering systems and design methods have improved. For instance, different kinds of failure in a system are now modeled and evaluated by simulation. This capability provides a greater understanding of failure mechanisms in the utilization stage and of how to avoid them. Using these capabilities, a reliability engineer can improve the predictability of a new product in the design phase. Designers also apply artificial intelligence (AI) and deep learning to their own design processes and to new products. It will be a challenge for engineers to use these tools in their own designs in order to optimize final designs more quickly [13]. The dynamic behavior of the system is another challenge in big data processing. System modeling requires a model that adapts to the age, degradation behavior, and condition of the system. Because data collected in real time can change what is known about the system, the pre-defined model may have to be updated. Model updating is therefore an interesting topic in this field [3].
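One common way to let a pre-defined model follow incoming data is Bayesian updating. The sketch below is a swapped-in illustration rather than the method of [3]: it assumes a constant failure rate with a gamma prior, and the class name, prior values, and field data are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class FailureRateModel:
    """Gamma(shape, rate) belief about a constant failure rate (failures per hour)."""
    shape: float  # pseudo-count of failures
    rate: float   # pseudo-exposure in operating hours

    def update(self, failures: int, hours: float) -> "FailureRateModel":
        # Conjugate gamma-Poisson update: add observed failures and exposure.
        return FailureRateModel(self.shape + failures, self.rate + hours)

    @property
    def mean(self) -> float:
        # Mean of the gamma distribution, used as the failure-rate estimate.
        return self.shape / self.rate

# Assumed design-stage prior: about 2 failures per 10,000 operating hours.
model = FailureRateModel(shape=2.0, rate=10_000.0)
# Hypothetical field data: 3 failures observed during 5,000 hours.
updated = model.update(failures=3, hours=5_000.0)
print(f"prior mean:   {model.mean:.6f} failures/h")
print(f"updated mean: {updated.mean:.6f} failures/h")
```

Because the field data shows more failures than the prior expected, the estimated failure rate rises; repeating the update as new data arrives keeps the model aligned with the current condition of the system.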

12.2.2 Internet of Things

IoT in a maintenance program can help increase safety, reliability, efficiency, connectivity, and communication [14, 15]. Figure 12.3 depicts the growth of IoT devices from 2015 to 2025.

Fig. 12.3 IoT devices growth during 2015–2025 [17]

The production capacity of a manufacturing plant is reduced during equipment breakdowns. IoT-based predictive maintenance could:

  • Increase the reliability and availability of equipment and machines;

  • Reduce costs;

  • Improve uptime;

  • Reduce the risks of safety, health, environment, and quality; and

  • Extend the lifetime of an aging asset [16].

By identifying a developing fault before it leads to failure, IoT-based predictive maintenance allows machines to be maintained in advance. IoT maintenance systems can monitor a machine's condition in real time, and software analyzes the data to create performance reports. The architecture of IoT-based predictive maintenance is illustrated in Fig. 12.4.

Fig. 12.4 IoT-based predictive maintenance architecture

Before proceeding to the technical details, the key factors that determine the equipment's health must be identified. Once these variables are determined, the equipment is fitted with sensors that collect information about them and send it to the cloud for processing. Sensor data cannot pass directly to the cloud; gateways are required to transfer it. In field gateways, the data is filtered and preprocessed. A cloud gateway connects the various field gateways over different protocols and ensures secure data transmission. A streaming data processor then receives the sensor data entering the cloud and passes the data streams quickly and efficiently to data storage, maintaining a continuous flow of data. The data collected by the sensors is stored in a data lake. At this stage, the data is still raw, so it may contain inaccurate or erroneous items. It is held as sets of sensor readings, each taken at a corresponding time. To gain insight into the health of the equipment, the data is then loaded into a big data warehouse, which contains vibration, temperature, and other parameters measured at a corresponding time, together with contextual information about the equipment's location, type, dates, etc.
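The data path described above can be pictured with a toy, in-memory sketch. It is only an assumption-laden illustration: the asset name, the context table, and both functions are hypothetical stand-ins for the field gateway, data lake, and data warehouse, and no real cloud service or protocol is modeled.

```python
from datetime import datetime

# Hypothetical contextual information about each asset.
EQUIPMENT_CONTEXT = {
    "pump-01": {"type": "centrifugal pump", "location": "line A"},
}

def field_gateway(readings):
    """Filter at the edge: drop readings with missing sensor values."""
    return [
        r for r in readings
        if r["vibration"] is not None and r["temperature"] is not None
    ]

def load_into_warehouse(data_lake):
    """Attach contextual information about the asset to each stored reading."""
    rows = []
    for r in data_lake:
        context = EQUIPMENT_CONTEXT.get(r["asset"], {})
        rows.append({**r, **context})
    return rows

readings = [
    {"asset": "pump-01", "time": datetime(2023, 1, 1, 8, 0), "vibration": 2.1, "temperature": 61.0},
    {"asset": "pump-01", "time": datetime(2023, 1, 1, 8, 1), "vibration": None, "temperature": 61.2},
    {"asset": "pump-01", "time": datetime(2023, 1, 1, 8, 2), "vibration": 2.3, "temperature": 61.5},
]

data_lake = field_gateway(readings)         # time-stamped readings as stored in the lake
warehouse = load_into_warehouse(data_lake)  # readings enriched with context for analysis
print(len(warehouse), warehouse[0]["type"], warehouse[0]["location"])
```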

Machine learning (ML) algorithms are used to analyze the data after it has been prepared. They detect abnormal patterns in datasets and reveal hidden correlations, and predictive models capture these patterns. Predictive models are built, trained, and then applied to diagnose whether a fault has occurred in a piece of equipment, identify the equipment's weak points, or predict its remaining useful life. Predictive models used for equipment maintenance may follow two approaches:

  • Regression approach: These models indicate how many days or cycles remain before a piece of equipment reaches the end of its useful life.

  • Classification approach: These models predict whether a piece of equipment is likely to fail and determine whether its properties are lower than usual.
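The two approaches can be contrasted with a deliberately simple sketch. It assumes a single wear indicator that grows roughly linearly and uses a plain least-squares line and a fixed warning level in place of a trained ML model; all names, data, and limits are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def remaining_cycles(cycles, wear, failure_level):
    """Regression approach: extrapolate the wear trend to the failure level."""
    slope, intercept = fit_line(cycles, wear)
    if slope <= 0:
        return None  # no upward wear trend, so no finite estimate
    cycle_at_failure = (failure_level - intercept) / slope
    return max(0.0, cycle_at_failure - cycles[-1])

def likely_to_fail(wear, warning_level):
    """Classification approach: label the asset from its latest reading."""
    return wear[-1] >= warning_level

# Hypothetical wear indicator, growing by 0.5 units per 100 cycles.
cycles = [0, 100, 200, 300, 400]
wear = [1.0, 1.5, 2.0, 2.5, 3.0]
print(remaining_cycles(cycles, wear, failure_level=5.0))  # roughly 400 cycles
print(likely_to_fail(wear, warning_level=4.0))
```

The regression output is a quantity (cycles left), whereas the classification output is a label (at risk or not); in practice both would be produced by models trained on historical failure data.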

Predictive models are usually updated about once a month and then tested for accuracy. If a model's output does not match the expected result, the model is revised, retrained, and tested again until it works properly. A significant amount of exploratory analytics should be performed before moving on to machine learning: the datasets are analyzed to detect relationships, trends, and insights. Furthermore, several technical assumptions are evaluated during the exploratory analytics stage to aid in the selection of the best-fit machine learning algorithm. An IoT-based predictive maintenance system can inform users of a likely equipment failure through user apps.
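The periodic accuracy check can be sketched as follows. The error metric (mean absolute error), the tolerance of five days, and the sample values are assumptions chosen only for illustration.

```python
def mean_absolute_error(predicted, actual):
    """Average size of the prediction error, in the unit of the data."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def needs_retraining(predicted, actual, tolerance):
    """Flag the model for revision when its recent error exceeds the tolerance."""
    return mean_absolute_error(predicted, actual) > tolerance

# Hypothetical remaining-life predictions (days) versus what was later observed.
predicted = [30, 45, 12, 60]
actual = [28, 50, 10, 41]
print(mean_absolute_error(predicted, actual))
print(needs_retraining(predicted, actual, tolerance=5.0))
```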

For instance, Fig. 12.5 illustrates the implementation of IoT-based predictive maintenance in a production line. Physical inspection of production line equipment sometimes requires personnel to enter dangerous environments, which may not be possible. Factories may use IoT-based predictive maintenance to anticipate possible breakdowns and boost the productivity of highly essential equipment. With sensors deployed throughout the equipment, the solution measures temperature, vibration levels, and other equipment properties. The system collects real-time sensor data and sends it to the cloud for analysis, prediction, and assessment [18].

Fig. 12.5 IoT-based predictive maintenance in a production line

12.2.3 Cyber-Physical System

Cyber-physical systems (CPS) are intelligent systems that include engineered, interacting networks of physical and computational (algorithm-based) components. These systems are highly interconnected and integrated, providing new functions to improve and enhance the quality of life and leading to technological advances in critical areas such as personal health care, emergency response, traffic flow management, smart manufacturing, national security and defense, and energy production and consumption. Currently, in addition to CPS, there are many other words and phrases that describe similar or related systems and concepts, such as Industrial Internet, Internet of Things (IoT), Machine-to-Machine (M2M), smart cities, and so on. There is a lot of overlap between these concepts, especially between CPS and IoT, which are sometimes used interchangeably (Fig. 12.6). In 2013, the International Telecommunication Union (ITU) defined the Internet of Things in a recommendation as follows:

A global social information infrastructure created by the interconnection (physical and virtual) of objects, based on existing and evolving information and communication technologies, with the ability to work with each other and enable the provision of advanced services.

Fig. 12.6 Internet of Things and cyber-physical systems

The true value of the Internet of Things is realized when the data generated by its sensors, devices, machines, and terminals can be received, interpreted, and processed by predictive systems, and the necessary commands are finally given to the appropriate operators. In other words, the true value of the Internet of Things for manufacturers lies in the analysis that results from the cyber-physical models of machines and systems. In Industry 4.0, the systems that can add value to the Internet of Things are cyber-physical systems (CPS). Objects in the IoT include physical world objects (physical assets) and virtual world objects, i.e., information. When the IoT is integrated with sensors and actuators, the resulting technology becomes an instance of the more general class of cyber-physical systems, which includes technologies such as smart grids, smart homes, smart transportation, and smart cities. The cyber-physical system is an interface between the human world and the cyber sphere, enabling the data collected by the system to be transformed into operational information and, ultimately, to optimize processes by interacting with physical assets.

When IoT technology is used to collect data from physical assets through sensors embedded in them, large volumes of data are generated and made available. Unfortunately, existing technologies are not sufficient to categorize and manage the huge amount of data that is generated daily. In addition, analytical methods and algorithms are not yet mature enough to process and analyze all of the generated data intelligently and efficiently. This is considered a big data challenge.

Because IoT and CPS both rely on networks, the Internet, and sensors, it is tempting to conclude that they are different definitions of a common concept. Despite this similarity, IoT and CPS are not the same thing. The conversion of data into information or tasks has placed a special emphasis on fault detection and prediction. Examples include the use of nonlinear data analysis methods in robotic applications, and the application of multiple baselines to obtain a machine health model that analyzes vibration, temperature, and torque data and diagnoses faults in the robot axes.

To meet the needs of the cyber level, it is necessary to use historical records and algorithms that learn over time to obtain reliable information on the health and estimated life of machines. Because a machine goes through several drops in performance, the development of health monitoring algorithms based on historical data is important. Analytical methods for practical industrial applications are complex, and life prediction methods must nevertheless respond to changes in operating conditions and to the impact of maintenance operations on life estimation. The cyber level provides more reliable information about the health status of machines than traditional condition monitoring does. In traditional condition monitoring, the condition of the machine is compared to its condition at start-up or to the ideal condition, called the “baseline”, and the health status of the machine is determined from the differences and the trend of changes.
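The traditional baseline comparison can be written down in a few lines. This is a minimal sketch under stated assumptions: a single vibration measurement, a fixed baseline taken at start-up, and hypothetical limits on the deviation and on its average growth per measurement.

```python
def deviations_from_baseline(readings, baseline):
    """Difference between each reading and the reference (baseline) value."""
    return [r - baseline for r in readings]

def average_trend(values):
    """Average change per measurement between the first and last value."""
    return (values[-1] - values[0]) / (len(values) - 1)

def health_status(readings, baseline, max_deviation, max_trend):
    """Judge health from the latest deviation and from how fast it is growing."""
    deviations = deviations_from_baseline(readings, baseline)
    if deviations[-1] > max_deviation or average_trend(deviations) > max_trend:
        return "degraded"
    return "healthy"

# Hypothetical vibration velocities in mm/s; 2.0 mm/s was measured at start-up.
readings = [2.1, 2.3, 2.6, 3.0, 3.5]
print(health_status(readings, baseline=2.0, max_deviation=1.2, max_trend=0.2))
```

A single fixed baseline ignores changes in operating conditions and the effect of maintenance, which is the limitation that motivates learning from historical records at the cyber level.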

As a perception-level cascading system, a CPS must include decision-making algorithms and support systems that can suggest appropriate maintenance and production measures, using condition-based and predictive maintenance, based on the health of monitored machines and their reliability value. Currently, there is no mature and fully integrated system that combines machine health with decision-making processes in a way that reflects the true values of machine health. Therefore, for many industries, achieving the level of perception is a major challenge. For example, according to studies based on “alternatives theory” [19], the appropriate time for maintenance and repair operations can be decided by estimating the remaining life of a physical asset (which is the output of the health monitoring system). Alternatives theory is an idea that has been used for many years to decide on buying or selling a fixed asset at or before the end of its useful life. The amount of information that needs to be processed is now so large, and so far beyond the capacity of human decision makers, that decision-making systems must first present the various options to operational staff, engineers, or maintenance staff, who then make the final decision. The studies showed that current technologies cannot, in practice, adequately give machines the ability to self-adjust or self-configure, and there are many research opportunities for the development of this aspect of CPS. For instance, although much work has been done to control the vibration and unbalance of machines, to neutralize the effect of chatter on rolling racks, or to control machine tools, there is a long way to go before rotating machines can configure themselves automatically. Nevertheless, knowing the capabilities of cyber-physical systems allows for the development of a promising design approach for CPS-based maintenance applications.

Interconnectivity, which was covered in the previous section, gives access to a wealth of data. However, access to data alone does not offer a major benefit. Managing, classifying, and processing data so that prognostics and health management (PHM) algorithms can analyze it further therefore requires a robust and flexible technique. This approach has to be comprehensive enough to take full advantage of the benefits of cyber-physical systems.

Lee and Bagheri [20] proposed the “Time Machine Methodology for Cyber-Physical Systems”, a methodical approach used to deploy CPS in maintenance applications. This methodology is responsible for organizing the data already available in a big data environment so that it is ready for use in PHM algorithms, and for ensuring that every asset in the fleet has a time machine record, which represents a type of digital twin. The purpose of this cyber twin is to gather and clean up data in preparation for future use. The information captured on the cyber side includes sensor data as well as installation history, operating parameters, system configuration, maintenance events, and others. The persistence of the cyber model over time is its most significant benefit. The physical asset will eventually fail, but its digital duplicate will continue to maintain its data indefinitely. A schematic representation of the CPS-based maintenance strategy is depicted in Fig. 12.7.
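To make the idea of a per-asset record more concrete, the sketch below shows one possible shape for such a record. It is not the implementation of Lee and Bagheri [20]; the class name, field names, and methods are hypothetical, and timestamps are assumed to be ISO date strings so that they compare correctly as text.

```python
from dataclasses import dataclass, field

@dataclass
class TimeMachineRecord:
    """Hypothetical per-asset record that outlives the physical asset."""
    asset_id: str
    installation_date: str
    system_configuration: dict
    operating_parameters: dict = field(default_factory=dict)
    sensor_snapshots: list = field(default_factory=list)
    maintenance_events: list = field(default_factory=list)

    def add_snapshot(self, timestamp, features):
        # Store cleaned sensor features together with the time they refer to.
        self.sensor_snapshots.append({"time": timestamp, **features})

    def add_maintenance(self, timestamp, description):
        self.maintenance_events.append({"time": timestamp, "event": description})

    def history_since_last_maintenance(self):
        # Snapshots taken after the most recent maintenance event.
        if not self.maintenance_events:
            return list(self.sensor_snapshots)
        last = self.maintenance_events[-1]["time"]
        return [s for s in self.sensor_snapshots if s["time"] > last]

record = TimeMachineRecord("pump-01", "2022-06-01", {"impeller": "type B"})
record.add_snapshot("2023-01-05", {"vibration": 2.4})
record.add_maintenance("2023-02-01", "bearing replaced")
record.add_snapshot("2023-03-01", {"vibration": 1.9})
print(record.history_since_last_maintenance())
```

Keeping maintenance events next to the sensor history lets a PHM algorithm restart its health assessment after each intervention instead of treating the whole history as one degradation path.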

Fig. 12.7 CPS-based maintenance strategy [20]

12.3 Assessment of Safety Risks

Risk assessment and management focus on identifying assets, analyzing vulnerabilities, and evaluating and estimating the damage that could occur. Risk assessment can be roughly divided into qualitative and quantitative approaches. Qualitative assessment relies heavily on expert experience, while quantitative assessment entails calculating a numerical risk value for the system. Many methods of assessing safety risk have been developed to date; some typical technologies for safety are described below.

12.3.1 Big Data

Research on big data in safety mainly covers five aspects: basic theories of safety big data, big data-driven safety management, big data-driven risk assessment and forecasting, big data application platforms and design schemes in safety management, and big data-related technology developments in safety management [21]. In the field of safety science, the application of big data has clearly run ahead of its theoretical study [22].

12.3.2 Cyber-Physical System

Risk assessment and management are of great importance in cyber-physical systems. When CPS was first developed, system designers gave more consideration to safety [23]. Safety risks may arise from interactions between the environment and the control system, within the control system itself, and between the control system and authorized users. The three fundamental security objectives in CPS and IT systems, confidentiality, integrity, and availability, are commonly known as the CIA triad [23,24,25]. In contrast with traditional IT systems, CPS places the highest priority on availability. As Table 12.1 shows, these fundamental objectives are important for both CPS and IT systems, but their priorities are different [26, 27]. The goal of availability and safety is to keep the system under a pre-defined and acceptable threshold [28].

Table 12.1 Security objectives in CPS versus IT systems in order of priority

Many CPS safety risk assessment methods have been developed. Examples include fault tree analysis (FTA), failure modes and effects analysis (FMEA), hazard and operability methodology (HAZOP), model-based engineering (MBE), goal tree-success tree and master logic diagram (GTST-MLD), system-theoretic process analysis (STPA), and temporary structures monitoring [29].
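As a small worked example of one of these methods, the sketch below evaluates a two-level fault tree. It assumes statistically independent basic events, and the tree structure and probabilities are hypothetical values chosen only to show the arithmetic.

```python
from math import prod

def and_gate(probabilities):
    """All inputs must fail: multiply the probabilities of independent events."""
    return prod(probabilities)

def or_gate(probabilities):
    """Any input failing is enough: 1 minus the probability that none fails."""
    return 1.0 - prod(1.0 - p for p in probabilities)

# Hypothetical tree: cooling is lost if the pump fails OR both power feeds fail.
p_pump = 0.01
p_main_power = 0.05
p_backup_power = 0.10

p_power_loss = and_gate([p_main_power, p_backup_power])
p_top = or_gate([p_pump, p_power_loss])
print(f"P(loss of cooling) = {p_top:.5f}")
```

The redundant power feeds contribute only 0.005 to the top event, so the single pump dominates the risk; this kind of ranking is what a quantitative FTA is typically used for.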

From the foregoing, it appears that the application of digital tools will help to:

  • Calculate reliability accurately through online condition monitoring;

  • Improve staff productivity and reduce human labor;

  • Manage maintenance efficiently;

  • Make better use of equipment and assets;

  • Operate cost-effectively;

  • Improve work safety and reduce risk;

  • Reduce machine stoppages; and

  • Reduce the costs related to major repairs.

12.4 Conclusion

In the fourth generation of industry, big data, the Internet of Things, cyber-physical systems, and quick response to change provide an opportunity for reliability engineering to improve system reliability. At the same time, increasing complexity, interconnections and dependencies between components, dynamic behavior, and advanced components such as CPSs and sensors make reliability engineering challenging for designers. It is necessary to update traditional methods and to develop new frameworks for reliability, risk, safety, and security.

In addition, with IoT-based predictive maintenance, equipment life can be extended by 30%, time-based maintenance can be eliminated, and equipment downtime can be decreased by 50%. However, a well-thought-out architecture with an emphasis on machine learning is required for a mature and dependable predictive maintenance system.

This chapter described the application of new methods and tools, such as big data and data processing, IoT, and cyber-physical systems, to the analysis of equipment reliability and risk. Future research should compare digital tools with traditional tools and methods, both in terms of their advantages and through benefit–cost analysis.

Our suggestion is that managers do not limit themselves to a specific type of digital tool, but think about how advanced digital analytics techniques can transform their maintenance and reliability systems. This means constantly looking for opportunities to improve the use of data and user-centered design principles in order to digitize processes. Sustained efficiency requires a combination of new digital tools, changes in asset strategy, and improved reliability performance.