
1 Introduction and State of the Art

In order to advance the competitiveness of manufacturing companies, their productivity must be increased, which requires smart maintenance strategies. Predictive Maintenance (PdM) enables increased availability, service life and performance of plant and equipment as well as optimised product and process quality. Through a comprehensive analysis of production systems, the Remaining Useful Life (RUL) of machine components can be estimated and thus the appropriate point in time for maintenance can be determined [1]. This leads to a higher Overall Equipment Effectiveness and thus to increased productivity. While reactive maintenance only provides for repairs after a breakdown, preventive maintenance aims at replacing components at predefined time intervals based on manufacturer specifications and empirical values [2]. Condition-based maintenance comprises continuous condition monitoring and the triggering of maintenance measures based on defined condition deviations and events. Currently, most companies rely on reactive or preventive maintenance strategies, as the initial costs and complexity of more proactive maintenance strategies are discouraging. PdM in particular poses challenges for many companies, as confirmed by a study from 2017 [3].

Existing approaches in the literature mostly focus on applying models, methods and technologies to specific use cases [4, 5] or highlight the most common techniques in the context of PdM, e.g. Long Short-Term Memory (LSTM)—according to Wu et al. [6] commonly used due to its efficiency in processing time series data—and linear regression [7, 8], or mechanisms for feature extraction as an important pre-processing step for training the models [9]. Furthermore, several publications deal with hybrid modelling approaches, e.g. combining various machine learning algorithms [10], data-driven and physics-based models [11, 12] or data mining and ontology-based semantics [13]. Even though scattered approaches exist covering PdM strategies from architecture and design to practical implementation [14], companies still cite the lack of a systematic approach—in the sense of a clear procedure based on an Industrial Internet of Things (IIoT) architecture from which implementation steps can be derived—as a crucial obstacle to establishing a corresponding business model and thus to implementing PdM into their manufacturing and process settings [3], which represents a research gap.

To close this research gap, this paper presents an approach for a generic IIoT architecture based on integrated models, methods and technologies, focused on providing a basis for implementing PdM in manufacturing companies from different industries with various production systems and diverse conditions, requirements and needs. The proposed architecture combines models, methods and technologies within Industry 4.0 compliant layers such as data acquisition, transfer and persistence, analysis and business logic. By implementing PdM at PHILIPS, CDS—together with the original equipment manufacturer SACMI—and GESTAMP within the scope of the EU-funded project Z-BRE4K, the applicability of the generic architecture and the included models, methods and technologies is demonstrated. For the development of the architecture as well as of the included models, methods and technologies, established approaches serve as orientation, but are continuously advanced and combined according to the conditions, requirements and needs of the three pilot cases. Particular attention is paid to the transparent development of the models, methods and technologies in continuous consultation with the industrial end users and their employees in order to increase acceptance of the applications. The overall result is a holistic architecture for PdM that, subject to minor adjustments, is applicable to a wide range of industries and use cases of manufacturing companies.

The structure of the paper is as follows: First, the proposed generic architecture for PdM and its layers are presented in Sect. 2. In Sect. 3, the application of the generic architecture to the three pilot cases is described; it gives only a short overview with exemplary excerpts of the three use cases in order to demonstrate the applicability of the generic architecture to different industries with various production systems and diverse conditions, requirements and needs. Finally, the paper is summarised and an outlook is given.

2 Approach for a Generic Architecture for Predictive Maintenance

Figure 1 presents an approach for a generic holistic architecture as a basis for the implementation of PdM in manufacturing and process settings, subject to minor customization. Its successive layers of data acquisition, data transfer and persistence, and data analysis, as well as the business logic layer, in which decisions are made based on the generated information, are described in more detail in the following sections.

Fig. 1 Approach for a generic architecture for PdM

2.1 Data Acquisition

The layer of data acquisition focuses on the shop floor of a manufacturing company and includes the entire machinery and the measuring equipment, in particular sensors. Furthermore, this layer is the source of expert knowledge about the machines and the production process, including engineering data, e.g. material specifications or data sheets of different components of the production systems. Measuring principles and sensor types must be selected or retrofitted depending on the required signals. This selection process can be supported by existing toolboxes, e.g. according to Fleischer et al. [15]. Furthermore, the sensors must be installed and mechanically integrated at an appropriate location. The sensor readings are transmitted to gateways for pre-processing of the signals using appropriate communication technology, e.g. bus systems.

2.2 Data Transfer and Persistence

The acquired sensor data is processed via gateways with functionalities such as interoperability, aggregation and, if required, local data pre-processing. Interoperability simplifies the connection of multiple machines and other devices, using a variety of interfaces, protocols and standards for both local installation (edge) and remote data transmission (connectivity). An ontology-based semantic framework can be applied to foster interoperability, i.e. to maintain consistency across different data sources and sinks. It involves semantic modelling of maintenance entities (e.g. production systems, work parts, failure modes) and actions, which then serve as a reference model that allows linking all data in the respective use case and providing standards-based access to all data related to a particular application. By brokering data coming from a wide variety of publishing sources, the data is centralized and contextualized and can then be subscribed to by data-analysing consumers. In order to ensure the security and sovereignty of the transferred and persisted data, the Industrial Data Space (IDS) provides an infrastructure for the fast, secure and sovereign transmission and use of data. Only certified participants whose identity has been verified beforehand are allowed to enter the data space. IDS connectors based on the FIWARE specification use system adapters to receive data from various sources, e.g. from a Message Queuing Telemetry Transport (MQTT) broker built on top of the gateway, and convert them into a standardized format. Consumers are able to subscribe to contextualized data from an integrated Orion Context Broker (OCB) in a standardized and secure manner via Next Generation Service Interfaces (NGSI), a protocol developed by the Open Mobile Alliance (OMA) to manage context information [16, 17].
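
To illustrate the subscription mechanism, the following minimal sketch shows how a data-analysing consumer could register an NGSI v2 subscription with the OCB; the broker address, entity type, attribute names and notification endpoint are assumptions for illustration and are not taken from the pilot cases.

```python
import requests

ORION_URL = "http://orion:1026/v2/subscriptions"   # assumed broker address

# Hypothetical subscription: notify an analysis service whenever the
# monitored attribute of a machine entity changes.
subscription = {
    "description": "Forward machine condition data to the analysis service",
    "subject": {
        "entities": [{"idPattern": ".*", "type": "Machine"}],    # assumed entity type
        "condition": {"attrs": ["acousticEmission"]},             # assumed attribute
    },
    "notification": {
        "http": {"url": "http://analysis-service:8080/notify"},   # assumed consumer endpoint
        "attrs": ["acousticEmission", "timestamp"],
    },
    "throttling": 1,  # at most one notification per second
}

response = requests.post(ORION_URL, json=subscription)
response.raise_for_status()
print("Subscription created:", response.headers.get("Location"))
```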

2.3 Data Analysis and Business Logic

The layer of data analysis aims to detect and predict faults and to estimate the RUL of critical machine components. Considering a deterioration profile of a machine and its components, the RUL is the period from the current health status of the machine to a detected fault based on computed condition indicators. Fault detection involves distinguishing between a healthy and a faulty status of a machine. Various reactive algorithms can be applied to detect anomalies within the data, tailored to the requirements of the respective use case. Some sensor readings do not show a significant trend between healthy states and faults and therefore do not contribute to the selection of useful features for training a model. Consequently, data reduction needs to be performed by selecting only the sensor signals with the strongest trend and combining them into condition indicators by means of feature extraction mechanisms.
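
As a minimal sketch of such a data reduction step, the following example ranks sensor signals by the strength of their monotonic trend and averages the normalized top-ranked signals into a single condition indicator; the scoring by Spearman correlation and the simple averaging are assumptions for illustration, not the feature extraction mechanisms used in the pilot cases.

```python
import numpy as np
from scipy.stats import spearmanr

def select_trending_signals(signals: dict, top_k: int = 3) -> list:
    """Rank sensor signals by the strength of their monotonic trend over time
    (Spearman correlation against the sample index) and keep the top_k."""
    scores = {}
    for name, values in signals.items():
        rho, _ = spearmanr(np.arange(len(values)), values)
        scores[name] = abs(rho)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

def condition_indicator(signals: dict, selected: list) -> np.ndarray:
    """Combine the selected signals into a single condition indicator by
    averaging their min-max normalized values (a simple health index)."""
    stacked = []
    for name in selected:
        v = np.asarray(signals[name], dtype=float)
        stacked.append((v - v.min()) / (v.max() - v.min() + 1e-12))
    return np.mean(stacked, axis=0)

# Usage with synthetic data: two drifting signals, one stationary
t = np.arange(500)
signals = {
    "bearing_temp": 40 + 0.02 * t + np.random.randn(500),
    "vibration_rms": 1.0 + 0.005 * t + 0.1 * np.random.randn(500),
    "ambient_temp": 22 + np.random.randn(500),
}
selected = select_trending_signals(signals, top_k=2)
ci = condition_indicator(signals, selected)
```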

There are three common approaches to predict faults and to estimate the RUL, depending on the data available [18]. First, if no data is available on the history of the running machine, but breakdowns are recorded as single events, survival models, i.e. probability density functions, can be used. This approach does not involve collecting data on machine behaviour prior to a failure. Second, if complete run-to-failure data (data from the healthy state, the degradation and the actual failure) of the machine in operation or of similar machines is available, similarity models should be used. Similarity models require a large amount of training data so that the machine in operation can be aligned with the historically most similar degradation profiles. In cases where failures cannot be detected and therefore no failure data is available to train a predictive model, (safety) thresholds must be defined for condition indicators that should not be crossed. In that case, a degradation model is useful: the behaviour of the running machine is monitored and, by identifying a pattern within the data, a trend can be extrapolated until the threshold is reached, e.g. via regression analysis. Another approach in cases where no failures have been observed is to simulate potential failures using physics-based models of the machine and its components. By hybridizing data-driven and physics-based models, simulations can be carried out quickly and cost-effectively to expand the amount of data required. In addition, using physics-based models, physical causes and effects can be investigated to gain additional knowledge, potential failure mechanisms can be identified and faults can be localized more easily.
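
A minimal sketch of the degradation-model approach is shown below: a linear trend is fitted to the condition indicator history and extrapolated until the safety threshold is crossed. The linear fit and the synthetic data are illustrative assumptions; any other regression model could be substituted.

```python
import numpy as np

def estimate_rul_linear(times, indicator, threshold):
    """Degradation-model sketch: fit a linear trend to the condition
    indicator history and extrapolate until the (safety) threshold is crossed.
    Returns the estimated remaining useful life in the same unit as `times`,
    or None if the fitted trend never reaches the threshold."""
    slope, intercept = np.polyfit(times, indicator, deg=1)
    if slope <= 0:                       # no degrading trend detected
        return None
    t_cross = (threshold - intercept) / slope
    return max(t_cross - times[-1], 0.0)

# Usage with synthetic data: indicator drifts from 0.2 towards a threshold of 1.0
t = np.arange(0, 200, dtype=float)                 # e.g. operating hours
ci = 0.2 + 0.003 * t + 0.02 * np.random.randn(t.size)
print("Estimated RUL [h]:", estimate_rul_linear(t, ci, threshold=1.0))
```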

Regarding the business logic, a decision support system (DSS) evaluates the performance of the machines and receives anomalies, predictions and RUL estimations published by the data-driven models, combining them into a single result by automatically tuning and promoting the most effective prediction approach. The DSS decides on preventive actions by activating recommendations to improve maintainability and operational efficiency on the shop floor. A feedback loop by the operator can be applied, which takes their opinions on the recommendations and their quality into account. The Failure Mode and Effects Analysis (FMEA) identifies the potential failure modes, causes and effects associated with a machine component and how the performance of the system is affected, addressing each failure mode and its respective effects in the system independently. The Failure Mode, Effects and Criticality Analysis (FMECA) additionally analyses the criticality of each potential failure mode and effect and calculates the Risk Priority Number (RPN) based on the combination of occurrence and severity of a given combination of failure mode, cause and effect. Based on that, a heat map provides information to the user in an intuitive manner. The user can immediately see whether a particular failure mode poses a high, medium or low risk to the system based on its position in the heat map.
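
The following minimal sketch illustrates this mapping from occurrence and severity ratings to an RPN and a heat-map risk level; the rating scale and the bin boundaries are assumptions for illustration and would be defined per use case.

```python
def risk_priority_number(occurrence: int, severity: int) -> int:
    """RPN as described above: combination of occurrence and severity,
    each rated e.g. on a 1-10 scale (the scale itself is an assumption)."""
    return occurrence * severity

def risk_level(rpn: int, low: int = 20, high: int = 50) -> str:
    """Map an RPN to the heat-map area shown to the user; the bin
    boundaries are illustrative and would be defined per use case."""
    if rpn >= high:
        return "high"
    if rpn >= low:
        return "medium"
    return "low"

# Usage: a failure mode with occurrence 4 and severity 9
rpn = risk_priority_number(occurrence=4, severity=9)
print(rpn, risk_level(rpn))   # 36 medium
```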

3 Application of the Generic Architecture in Pilot Cases

The generic architecture proposed in Sect. 2 is applied in three pilot cases with industrial end users to demonstrate the compatibility of the architecture with different industries with various production systems and diverse conditions, requirements and needs. One pilot case deals with cold forming for the manufacturing of cutting elements for electric shavers at the PHILIPS factory. The second one is about continuous compression moulding for the manufacturing of plastic closures at CDS and the third one involves cold forming, arc-welding and in-line quality inspection for the manufacturing of chassis parts at GESTAMP.

3.1 Cold Forming for the Manufacturing of Shaver Cutters

Shaving systems are manufactured at the PHILIPS factory. One of the parts of a shaving system is a cutter. These cutters are produced on a production line consisting of cold forming, finishing, measuring and assembly. Figure 2 presents the implementation of the generic architecture to this pilot case.

Fig. 2 Adapted architecture for PdM at PHILIPS

Data acquisition. This pilot case focuses on the dies of a press that form the cutters from a metal strip entering the press. Multiple acoustic emission (AE) sensors are installed at crucial positions to measure signals indicating changes in the hardened steel.

Data transfer and persistence. The AE data along with quality inspection information are collected in real time by the measuring equipment coupled to the factory network. After pre-processing, a gateway sends the data to a database and publishes them to an MQTT broker extended with an IDS connector. Via the consumer counterpart of the IDS connector provider, the data contextualized by an integrated OCB is transferred to the data-driven models to detect and predict faults. The semantic framework is based on an ontology structuring the entities of the IDS connectors. The cold forming dies and all related parts, embedded data sources, failure modes, severity indicators and maintenance actions are modelled as an enumeration of related entities to represent the pilot case-specific knowledge in a machine-understandable manner.
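
A minimal sketch of the publishing step is shown below, using the paho-mqtt client to send a pre-processed AE reading as JSON to the broker; the topic naming, payload fields and broker address are assumptions for illustration.

```python
import json
import time
import paho.mqtt.publish as publish

# Hypothetical pre-processed AE reading published by the gateway;
# topic naming and payload fields are illustrative assumptions.
reading = {
    "timestamp": time.time(),
    "rms": 0.42,       # assumed pre-processed AE features
    "peak": 1.87,
}

publish.single(
    topic="philips/press/die1/ae",      # assumed topic scheme
    payload=json.dumps(reading),
    hostname="gateway.local",           # assumed broker address
    qos=1,
)
```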

Data analysis and business logic. By subscribing to the AE data as well as other data sources, e.g. maintenance logs, in JSON format from the OCB using the NGSI API, various algorithms are applied to detect and predict faults of the cold forming dies. To detect faults, i.e. AE waves that do not conform to neighbouring signals, a simple rule-based approach monitors specific sensorial inputs to detect rule violations based on user-defined thresholds. Since threshold specification is difficult due to the high instability of the sensorial input, a distance-based outlier detection approach is used, which alleviates the need for rule specifications for each different sensorial input. This approach is based on the Micro-cluster Continuous Outlier Detection algorithm and can be applied to streaming data. For fault prediction, an event-based algorithm is utilized, identifying patterns of events that occur before failures in order to train the models for predictions. An approach for discretization of the input signal is then used to convert the sensorial signal into artificial events that may or may not have an actual meaning in the physical world. It should be noted that a number of other algorithms (e.g. autoencoders, matrix profiles, Apriori) are used for both fault detection and prediction for validation purposes and thus to increase the reliability of the calculations. Besides the aforementioned data-driven models, the physics-based model of the dies is represented as a Finite Element Method (FEM) model, analysing critical parts against high cycle fatigue to derive recommendations for strengthening the weakest parts, supporting employees during the design phase of the next generation of machine components to optimize the reliability of the machine and thereby minimize breakdowns.
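
The following simplified sliding-window, distance-based detector illustrates the principle behind such an approach: a sample is flagged when it has too few neighbours within a radius inside the current window. It omits the micro-cluster optimisations of the actual MCOD algorithm, and the window size, radius and neighbour count are illustrative assumptions.

```python
from collections import deque
import numpy as np

class WindowedDistanceOutlierDetector:
    """Simplified sliding-window, distance-based outlier detector.
    A point is flagged if it has fewer than k neighbours within radius r
    inside the current window (the principle behind MCOD, without its
    micro-cluster optimisations)."""

    def __init__(self, window_size=200, radius=0.5, k=5):
        self.window = deque(maxlen=window_size)
        self.radius = radius
        self.k = k

    def update(self, x: float) -> bool:
        """Feed one sample; returns True if it is an outlier w.r.t. the window."""
        data = np.array(self.window) if self.window else np.array([])
        neighbours = int(np.sum(np.abs(data - x) <= self.radius)) if data.size else 0
        self.window.append(x)
        return len(data) >= self.k and neighbours < self.k

# Usage: stable synthetic AE feature stream with one injected spike
detector = WindowedDistanceOutlierDetector(radius=0.3, k=5)
stream = list(np.random.normal(1.0, 0.1, 300)) + [2.5]
flags = [detector.update(x) for x in stream]
print("Outlier detected:", flags[-1])
```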

In order to combine multiple fault detections or predictions, a fusion component is applied to generate higher-level detection and prediction strategies, e.g. monitoring the first and the second die of the press and generating a report only if faults are detected in both dies. This leads to a reduction of false positive reports. Finally, a DSS receives input both from a FMECA, mainly failure modes and severity indicators, and from the fusion component, and uses a rule engine, which is able to decide automatically upon specific actions to mitigate the detected and predicted faults. The DSS applies business-related rules that are created by the shop floor managers based on their experience and knowledge.
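
A minimal sketch of this fusion and rule logic is given below; the specific actions and the severity threshold are illustrative assumptions, not the shop floor managers' actual rules.

```python
def fuse_detections(detections: dict) -> bool:
    """Higher-level strategy from above: report only if faults are
    detected in both dies of the press (reduces false positives)."""
    return detections.get("die_1", False) and detections.get("die_2", False)

def decide_action(report: bool, severity: int) -> str:
    """Illustrative rule-engine logic combining the fused report with a
    FMECA severity indicator; actions and threshold are assumptions."""
    if report and severity >= 8:
        return "stop press and schedule immediate die inspection"
    if report:
        return "notify maintenance team for next planned stop"
    return "continue monitoring"

# Usage: both dies report a fault, failure mode has severity 9
print(decide_action(fuse_detections({"die_1": True, "die_2": True}), severity=9))
```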

3.2 Continuous Compression Moulding for the Manufacturing of Plastic Closures

Plastic closures for the food and beverage industry are manufactured at CDS. This paper particularly addresses the production of plastic closures carried out by means of compression moulding techniques. In this regard, CDS produces various formats of products with many diverse parameters such as material, dimensions or weight. Figure 3 presents the implementation of the generic architecture to this pilot case.

Fig. 3 Adapted architecture for PdM at CDS

Data acquisition. This pilot case focuses on five Continuous Compression Moulding (CCM) machines, designed and manufactured by SACMI, with three auxiliary modules, a hydraulic unit (HU), a plastic extruder (EX) and a thermal regulator (TH), equipped with several sensors to measure pressures, temperatures and volume flow rates. The TH provides an aqueous fluid to cool the cavities of the CCM machines and is considered in more detail in the following paragraphs.

Data transfer and persistence. The sensor signals are collected in a gateway. To make the data available to the subsequent IIoT infrastructure, additional software is installed on the gateway. It publishes the data to an MQTT broker and enables a cloud storage as an intermediate layer that forwards the data to an integrated backend, where a first data visualization provides early information to the operator about the status of the machine. All data is transformed into Resource Description Framework (RDF) triples and stored in a triple store database. On top of the database, a reasoner supports the extraction of knowledge according to the consumer's queries, building a knowledge-based system. The cloud storage is extended with an IDS connector to ensure data security and sovereignty, following the same policy as in Sect. 3.1.
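
As a minimal sketch of how a consumer could query such a triple store, the following example retrieves failure modes and severities via SPARQL; the endpoint URL, namespace and property names are illustrative assumptions and do not reflect the project ontology.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Hypothetical query against the triple store: retrieve the failure modes
# recorded for the thermal regulator. Endpoint URL, namespace and property
# names are assumptions for illustration only.
sparql = SPARQLWrapper("http://triplestore.local/sparql")
sparql.setQuery("""
    PREFIX zb: <http://example.org/zbre4k#>
    SELECT ?failureMode ?severity
    WHERE {
        zb:ThermalRegulator zb:hasFailureMode ?failureMode .
        ?failureMode zb:hasSeverity ?severity .
    }
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["failureMode"]["value"], row["severity"]["value"])
```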

Data analysis and business logic. By subscribing to the sensor data in JSON format from an OCB using the NGSI API, various algorithms are applied to detect faults and to estimate the RUL. For instance, a regression-based approach is used to detect faults of the TH, which are mainly caused by a clogging filter, based on the trend of the incoming sensor data. The approach uses historical data and detects violations of predefined thresholds by extrapolating the measured trend either linearly or exponentially. By applying an event clustering algorithm, the extrapolated trends of the individual sensor values based on the regression analysis are weighted up to the thresholds, so that an overall RUL of the filter is calculated. Due to missing machine breakdowns and thus a lack of detected faults, a physics-based representation of the TH as an object- and signal flow-oriented modelling approach with integrated virtual sensors is applied for multi-physics simulations of the TH. The hybridization of the regressive event tracker model and the physics-based model is carried out by varying parameters in the physics-based simulation model and automatically integrating virtual sensor values into the regressive event tracker model, while parameters changing during operation are in turn integrated into the physics-based model.
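
The following sketch illustrates the final weighting step, combining the per-sensor RUL estimates obtained from the extrapolated trends into one overall RUL for the filter; the weights and the simple weighted average are assumptions for illustration.

```python
def overall_rul(per_sensor_ruls: dict, weights: dict):
    """Weighted combination of the per-sensor RUL estimates obtained from
    the extrapolated regression trends; sensors without a usable trend
    (None) are skipped. The weights are illustrative assumptions."""
    usable = {s: r for s, r in per_sensor_ruls.items() if r is not None}
    if not usable:
        return None
    total_weight = sum(weights[s] for s in usable)
    return sum(r * weights[s] for s, r in usable.items()) / total_weight

# Usage: extrapolated RULs (in hours) for three TH sensors, weights assumed
ruls = {"pressure_drop": 120.0, "flow_rate": 150.0, "outlet_temp": None}
print(overall_rul(ruls, {"pressure_drop": 0.5, "flow_rate": 0.3, "outlet_temp": 0.2}))
```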

Reports based on a fusion algorithm are triggered when predefined combinations of the models' outcomes are reached. Besides the rule-based processing of these outcomes, the DSS requests information from a FMECA tool, in particular the level of criticality of the failure modes, and checks the combination of the confidence level along with the criticality and the timeframe of the predictions to activate maintenance actions, e.g. as a notification to a mobile device of the operator. Furthermore, a Computerized Maintenance Management System (CMMS) is implemented, so that the operator can schedule maintenance actions and interface with other entities such as the spare part warehouse inventory for an availability check.

3.3 Cold Forming and Arc-Welding for the Manufacturing of Chassis Parts

Lightweight chassis parts are manufactured at the GESTAMP facility. The production line includes a stamping cell with a servo-driven press for cold forming (bending and cutting) of the incoming steel sheets, a robot for arc-welding the formed parts and an in-line multi-sensor quality control system to ensure the quality of the finished manufactured parts. Figure 4 presents the implementation of the generic architecture to this pilot case.

Fig. 4 Adapted architecture for PdM at GESTAMP

Data acquisition. The stamping press is a closed system equipped with several pre-installed sensors for measuring torque, temperature, pressure and lubrication, and two strain gage sensors installed at two connecting rods of the press. These locations are selected to provide the most significant strain history. The distribution of the load causes inertial forces that generate cyclic axial forces and stress, as well as bending moments and stress perpendicular and parallel to the eccentric shaft. The tonnage signature provides important information that allows statements about the load, changes in stock thickness and hardness, part lubrication, tool wear, stuck scrap in the die and the quality of the stamped parts. The arc-welding cell consists of two welding robots, a welding gun with a contact tip and an infrared (IR) sensor for in-line quality control of the welding process. IR imaging provides information on the melt pool and surrounding areas during the welding process, such as geometry and temperature distribution. The IR system comprises a thermal camera and an embedded processing unit to perform the vision tasks in real time and to ensure interoperability with subsequent applications.

Data transfer and persistence. All raw data acquired from each stroke of the stamping press as well as features extracted from the video sequence of the IR system, together with other process parameters such as voltage and current, are transferred as an XML or CSV file to a shared server. An IDS connector periodically queries this server via a system adapter. Once a new file is detected, the system adapter parses the information to NGSI. By sending the data to an integrated OCB, consumers are able to subscribe to a certain type of information in a standardized, secure and sovereign manner. The semantic framework is based on an ontology, following the same policy as mentioned in Sect. 3.1. Once the data is contextualized, consumers are able to analyse it.
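
The polling step of such a system adapter might look like the following sketch, which picks up new CSV files, maps each row to an NGSI entity and pushes a batch update to the OCB; the file layout, entity type, attribute names and broker address are assumptions for illustration.

```python
import csv
import glob
import requests

ORION_URL = "http://orion:1026/v2/op/update"   # assumed broker address
SEEN = set()                                   # files already processed

def poll_shared_folder(path="/mnt/shared/*.csv"):
    """Sketch of the system adapter's polling step: pick up new CSV files,
    map each row to an NGSI entity and push it to the context broker."""
    for filename in sorted(glob.glob(path)):
        if filename in SEEN:
            continue
        with open(filename, newline="") as f:
            entities = [
                {
                    "id": f"urn:ngsi-ld:Stroke:{row['stroke_id']}",   # assumed columns
                    "type": "PressStroke",                            # assumed entity type
                    "force": {"type": "Number", "value": float(row["force"])},
                    "timestamp": {"type": "DateTime", "value": row["timestamp"]},
                }
                for row in csv.DictReader(f)
            ]
        requests.post(ORION_URL, json={"actionType": "append", "entities": entities})
        SEEN.add(filename)
```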

Data analysis and business logic. With regard to the stamping process, the Structural Similarity Index Measure (SSIM), a perceptual metric known from quantifying image quality degradation caused by image processing, is used. Outliers are identified, and the failure rate determines the depreciation of the stamping dies. An instantaneous stamping force outside the tolerance indicates a defective part. In this regard, the defect rate provides the depreciation of the cold forming dies, and therefore the RUL can be estimated. Since there are only minor differences to the pilot case in Sect. 3.1 regarding the physics-based representation of the press as a FEM model, this is not explained in more detail at this point.
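
As a minimal sketch of how such a similarity-based outlier check could work, the following example computes a global (single-window) SSIM between a measured stroke signature and a healthy reference and flags the stroke when the similarity drops below a threshold; the reference signature, the threshold and the single-window simplification are assumptions for illustration, not the pilot's exact pipeline.

```python
import numpy as np

def ssim(x: np.ndarray, y: np.ndarray, c1=1e-4, c2=9e-4) -> float:
    """Global Structural Similarity Index between two signals/arrays
    (single-window variant of the standard SSIM formula)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))

# Usage: compare a measured stroke force signature to a healthy reference;
# flag the stroke as an outlier if the similarity falls below a threshold.
reference = np.sin(np.linspace(0, np.pi, 200))       # assumed healthy signature
measured = reference + 0.05 * np.random.randn(200)
is_outlier = ssim(measured, reference) < 0.9          # threshold is an assumption
print("Outlier stroke:", is_outlier)
```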

Regarding the welding station, a video processing pipeline for condition monitoring and quality control combining edge and cloud processing is devised. At the edge, a feature extraction module for reducing the dimensionality of the data sent to the cloud and a quality control classifier are implemented. First, a detection algorithm is applied to recognize and crop the region of interest around the contact tip during welding. Then, a bivariate classifier combining convolutional (CNN) and recurrent (LSTM) neural networks performs a quality assessment, classifying parts into defect and non-defect ones. The CNN serves for the extraction of spatial features, the LSTM-based network for the extraction of temporal features, and a fully connected layer classifies the extracted spatiotemporal features. In the cloud, a single LSTM model driven by the extracted spatial features is applied for condition monitoring. Finally, a geometric dimensioning and tolerancing (GD&T) analysis is applied to control the quality of the finished parts, using a point cloud of scanned parts to evaluate deviations by comparing the scan to a theoretical CAD model. The fusion of all data analysis outcomes is done in the context of the DSS, which follows the same logic (including the interaction with FMECA) as described in the two pilot cases above.
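
The structure of such a CNN+LSTM classifier is sketched below in PyTorch: a small CNN extracts spatial features per IR frame, an LSTM models the frame sequence, and a fully connected layer outputs the defect/non-defect logits. The layer sizes, input resolution and sequence length are illustrative assumptions, not the architecture used in the pilot case.

```python
import torch
import torch.nn as nn

class WeldQualityClassifier(nn.Module):
    """Hypothetical CNN+LSTM classifier: spatial features per IR frame,
    temporal modelling across the frame sequence, binary defect output."""
    def __init__(self, n_channels=1, feat_dim=64, hidden_dim=32):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(n_channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.proj = nn.Linear(32 * 4 * 4, feat_dim)     # per-frame spatial feature vector
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, 2)              # defect / non-defect logits

    def forward(self, clips):                 # clips: (batch, time, channels, H, W)
        b, t, c, h, w = clips.shape
        frames = clips.view(b * t, c, h, w)
        spatial = self.proj(self.cnn(frames).flatten(1))   # (b*t, feat_dim)
        seq = spatial.view(b, t, -1)
        _, (h_n, _) = self.lstm(seq)                        # last hidden state
        return self.fc(h_n[-1])                             # (b, 2)

# Usage: classify a batch of 2 clips with 8 IR frames of 64x64 pixels each
model = WeldQualityClassifier()
logits = model(torch.randn(2, 8, 1, 64, 64))
```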

4 Summary and Outlook

Within the scope of the EU-funded project Z-BRE4K, an architecture for PdM based on integrated models, methods and technologies has been developed with the aim of implementing PdM in manufacturing companies. The proposed architecture, including various models, methods and technologies that were continuously advanced based on existing approaches according to the requirements of the use cases, has been effectively applied at three companies due to its holistic nature and modularity. Thus, a successful transformation from reactive or preventive maintenance strategies to PdM at PHILIPS, CDS and GESTAMP has been realised. Due to the transparent development of the concepts and solutions, they are applied by the employees of the companies in their daily work. The applicability of the architecture has been demonstrated, and it can now serve as a reference for companies from different industries with various production systems and diverse conditions, requirements and needs for implementing PdM.

In follow-up activities, measurements of improved key performance indicators and a profitability analysis should be carried out, as companies want to achieve a return on investment as soon as possible. Furthermore, the integration into higher-level systems must be expanded in order to establish PdM in the entire system landscape of a company.