1 Introduction

It is widely accepted that condition-based maintenance (CBM) is the most efficient maintenance strategy, and the basic reasons are simple. Running production without any condition monitoring or preventive maintenance can be extremely costly: even a small fault can suddenly stop production for as long as the repair takes, which is sometimes a very long time and results in lost production, lost labour and inefficient use of the whole investment. On the other hand, maintenance carried out to guarantee that no stoppages occur, without measuring the actual need for it, can also be costly. Much of that maintenance may be done in vain: it costs time and money, and the intervention itself may introduce faults in the machinery, which in turn lead to further maintenance actions.

1.1 Internet of Things

With the advent of the Internet of Things (IoT), novel strategies became feasible in industrial applications. The main difference between the IoT and other concepts regarding embedded technologies (cyber-physical systems (CPS), wireless sensor networks, machine-to-machine communication, body area networks, etc.) is that the IoT considers all embedded systems as connected to the Internet. As far as functional requirements are concerned, this characteristic allows for ubiquitous access to the embedded systems. Moreover, since IoT devices make use of mature protocols and well-accepted software libraries, applications built on top of them are faster to implement and easier to maintain, leading to shorter time to market and cheaper maintenance.

1.2 Cyber-Physical Systems

Nowadays, conventional systems and processes are evolving into CPS in the most disparate application contexts (e.g. manufacturing, healthcare, automotive, white goods, logistics, etc.) and of different natures (e.g. mechanical, electrical and chemical). As stated in [1], the term “Cyber-Physical Systems” was coined in 2006. Today, several definitions of the term CPS can be found in the literature. According to [2], CPS can be defined as transformative technologies for managing interconnected systems between their physical assets and computational capabilities. The definitions in [3, 4] highlight the concept of collaboration and service provisioning: CPS are defined as systems of collaborating computational entities that are strictly connected to the surrounding physical assets, providing and using services to/from the Internet. A working definition has been offered in [5], where a CPS is defined as a system consisting of computational, communication and control components combined with physical processes. Regardless of the specific definition, it is possible to identify the core elements and characteristics of CPS, extended from [6, 7]: (1) enhancement of physical entities with cyber capabilities; (2) networking at multiple and extreme scales; (3) dynamic behaviour (plug and unplug during operation); (4) high degrees of automation, with control loops that are typically closed; (5) a high degree of autonomy and collaboration to achieve a higher goal; and (6) tight integration between devices, processes, machines, humans and other software applications. As explained in depth in [8], the intrinsic characteristics of CPS naturally point to ecosystems of interacting and connected CPS, also called Cyber-Physical Systems of Systems (CPSoS) or, in the industrial domain, Cyber-Physical Production Systems (CPPS).
CPSoS and CPPS promote the design and development of advanced monitoring and control infrastructures that rely on a common virtualized space for collecting, processing, provisioning, analysing and visualizing large quantities of data [9]. These data can potentially be used for fast evaluation of the performance of industrial assets, in order to adapt and optimize (through reconfiguration) the overall behaviour of the production system while enabling the efficient and effective implementation of maintenance policies such as CBM.

1.3 Challenges

The wider dissemination of CPS (and their aggregation into CPSoS) and IoT is creating new market opportunities and business models for all kinds of European industries. The new wave of digitization and interconnection of products, services, processes, enterprises and people is expected to generate significant benefits for all the involved actors, assuming that the risks and challenges are properly addressed [10]. In this landscape, IoT/CPS-based platforms are steadily growing in size and in the range of their target applications. However, even though progress is made every day, supported by continuous technological advancements, IoT/CPS design, development and deployment are still challenging. The great potential of IoT and CPS solutions, and the enormous expectations around them, translate into real challenges that the research community is asked to address in order to boost the progress and deployment of these solutions in real application contexts. The research challenges summarized here are extracted from [11, 12, 13] and clustered according to [2]:

  a. Science and engineering foundations: to define a reference architecture for interoperable and interconnected CPS-populated systems in cross-sector applications, and to enable seamless human-IoT/CPS interaction.

  b. System performance, quality and acceptance: to create large, adaptive and resilient networked systems that are capable of operating in the specific environments where the physical entities are installed while delivering the required functionality in a reliable way. To develop science-based metrics for measuring system adaptability, flexibility, responsiveness, security and safety and, more generally, methods to predict the behaviour of highly dynamic systems.

  c. Applied development and deployment: to provide mechanisms for representing highly distributed and heterogeneous systems. To provide methodologies for virtualization of physical entities and integration of heterogeneous systems. To deliver the technology foundation for building interconnected and interoperable IoT/CPS-populated systems.

The platform that accommodates CBM data needs to tackle the following issues:

  a. Provide interoperability at system level, allowing the transmission of data from the CPS. How can these data be transmitted from the physical system, and to where?

  b. Use data representation models that enable the collection of CBM information (events, root cause analysis, fault prediction and remaining useful life results) related to CPS. How can interoperable data representation and semantics be created?

  c. Provide the mechanisms to process CBM data in real time or in batch processes. How can real-time restrictions be maintained while abiding by communication constraints?

  d. What back end can process these inbound streams in a scalable manner?

1.4 Paper Structure

Section 1 of this paper introduces CBM, IoT and CPS and describes some of the challenges related to the adoption of these technologies. Section 2, Predictive and Proactive Maintenance Platform, discusses the MANTIS approach and presents the role of MIMOSA. Section 3, MANTIS Reference Architecture and Implementation, discusses the MANTIS architecture and its implementation. Section 4 summarizes the paper in a short conclusion.

2 Predictive and Proactive Maintenance Platform

The overall aim of the MANTIS project [14] is to develop a platform for interoperable and interconnected CPS-based systems for proactive collaborative maintenance ecosystems, i.e. for facilitating the implementation of predictive and proactive maintenance strategies. The objective of predictive maintenance is to predict when maintenance should be performed, so that stops can be planned better and unexpected interruptions avoided, which increases plant availability [15]. Proactive maintenance seeks to detect and eradicate failure root causes [16]. It depends on rigorous machine inspection and condition monitoring. Many technologies are employed to evaluate equipment condition (infrared, acoustics, vibration analysis, electrical motor power analysis, etc.). Site measurements are often supported by wireless sensor networks, and data analysis is essential.

2.1 MANTIS Approach

The MANTIS reference architecture and platform provide a practical means of implementing collaborative maintenance by taking advantage of:

  a. The omnipresence of intelligent devices, which combine physical entities with computational and communication capabilities, in modern processes, machines and other distinct application domains.

  b. The maturity level reached by cloud-based infrastructure and the huge amount of computational and storage resources available and usable “on-demand”.

Intelligent devices are the ones directly connected to and/or installed on the physical resources and assets. They can potentially optimize and improve current maintenance activities and their related management systems by providing (often live) data, gathered during operation, that can be analysed (low-level data analysis) to understand the behaviour of the related physical resources and assets. Furthermore, the data gathered from physical resources and assets can also be combined and analysed globally (high-level data analysis) by using computational resources and complex algorithms running in the cloud to understand the collective behaviour of groups of resources and assets. Therefore, within the MANTIS approach, data extraction, transformation, loading and pattern analysis take place at two different levels, namely (see Fig. 1):

Fig. 1 MANTIS overall concept idea and data processing levels [14]

  a. Low level: extraction, transformation, loading and analysis of simple signals to model and understand the behaviour of selected physical resources and assets.

  b. High level: extraction, transformation, loading and analysis of the complex data resulting from the low level to model and understand the global behaviour of assets.
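The split between the two levels can be illustrated with a minimal Python sketch. It is not taken from MANTIS: the asset names, the sample values, the chosen features and the 1.5x threshold are all assumptions made for illustration. The low-level step reduces each asset's raw signal to a few features, and the high-level step compares those features across the group of assets.

```python
from statistics import mean, median, pstdev


def low_level_features(samples):
    """Low level: reduce the raw signal of ONE asset to a few features."""
    return {"mean": mean(samples), "std": pstdev(samples), "peak": max(samples)}


def high_level_outliers(features_by_asset, factor=1.5):
    """High level: compare low-level results ACROSS assets.

    Flags assets whose mean level exceeds `factor` times the fleet median.
    The rule and the default factor are illustrative assumptions.
    """
    fleet_median = median(f["mean"] for f in features_by_asset.values())
    return sorted(
        asset
        for asset, f in features_by_asset.items()
        if f["mean"] > factor * fleet_median
    )


# Hypothetical vibration readings per asset (made-up values).
raw = {
    "pump-1": [1.0, 1.2, 0.8],
    "pump-2": [1.1, 0.9, 1.0],
    "pump-3": [2.9, 3.1, 3.0],
}

features = {asset: low_level_features(s) for asset, s in raw.items()}
suspects = high_level_outliers(features)
print(suspects)
```

In such a setup only the small feature dictionaries, not the raw samples, would need to travel from the intelligent devices to the cloud.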

Since data sources are typically characterized by distribution, heterogeneity and a high degree of dynamicity (e.g. data sources like sensors can be plugged and unplugged any time), the design of the MANTIS architecture has been driven by the following main requirements:

  a. The provision of data structures that enable the collection of maintenance information (events, root cause analysis, fault prediction and remaining useful life results, etc.) related to systems and assets.

  b. The provision of data structures that enable large volumes of data to be processed in real time or in batch processes.

  c. Integration of complex and heterogeneous large-scale distributed systems from different application domains.

  d. The design of CPS-populated systems to enable collaborative proactive maintenance strategies.

The design of interoperable and interconnected CPS-based maintenance ecosystems is therefore a key element of MANTIS implementations: it allows data sources to be added to or removed from the MANTIS platform dynamically and on demand, so that most of the maintenance-relevant information can be gathered automatically from the environment.

2.2 Role of MIMOSA

The use and benefit of a CBM strategy are based on information and the knowledge that is gained from that information. It follows that a lot of information has to be managed: condition monitoring measurement data, information on the use of the machine in question, data about previous maintenance actions and exact information about the components of the machinery. Naturally, all this information has to be integrated in order to make it meaningful for defining the need for maintenance and the timing of these actions. What makes the situation challenging is that there are many sources and representations for this information and that the systems holding it do not usually interoperate. In most cases, the production machinery has been designed with some computer-aided design (CAD) software, which holds information about all the components as designed. In the next phase, some computer-aided manufacturing (CAM) system has been used for manufacturing; this CAM system holds information about the components as manufactured. Usually, a manufacturing company uses some product lifecycle management (PLM) system to follow the whole lifecycle of a product. When production machinery is in use, the whole production process is managed with an enterprise resource planning (ERP) system that holds information about all the assets of the company and also about the personnel managing these machines. The machines are normally driven by some proprietary automation system. Many companies use a computerized maintenance management system (CMMS) to handle the maintenance of their production machinery. Usually, the condition monitoring (CM) systems are separate from the CMMS. Since a CBM strategy needs information from all of the above-named systems, it is clear that interoperability and integration of information are of the highest importance.
Experience from industrial practice reveals that the heterogeneity and lack of integration of this information are a considerable roadblock towards Maintenance 4.0.

There exists an organization called the Operations and Maintenance Information Open System Alliance (MIMOSA). MIMOSA sees its role as an integrator between various systems and states [17] that it provides a series of interrelated information standards. According to MIMOSA, the Common Conceptual Object Model (CCOM) provides a foundation for all MIMOSA standards, while the Common Relational Information Schema (CRIS) provides a means to store enterprise O&M information. MIMOSA also manages and publishes the Open O&M Web Service Information Service Bus Model (ws-ISBM) and Common Interoperability Register (CIR) specifications. In addition, MIMOSA has aligned with the POSC Caesar Association (PCA) in the development of Reference Data Libraries and with the Open Applications Group Integration Specification (OAGIS) in the use of its Business Object Document (BOD) architecture to support information exchange.

MIMOSA also states that it maintains strong industry ties with other formal standardization groups [17]. For example, MIMOSA is compliant with, and forms the informative reference to, the published ISO 13374-1 standard for machinery diagnostic systems. MIMOSA hosts the Oil and Gas Interoperability (OGI) Pilot, which is managed by the joint MIMOSA/PCA O&M SIG and works as part of the US TAG in ISO TC 184/WG 6, which is developing the ISO OGI Technical Specification.

3 MANTIS Reference Architecture and Implementation

The purpose of the MANTIS reference architecture is to support companies interested in predictive and proactive maintenance in setting up an adequate system architecture, especially with regard to issues that are new, risky and costly to change. Since large amounts of data can be collected from industrial devices, machines or vehicles, it makes sense to try to utilize them. Obviously, the architecture of any Maintenance 4.0 solution has to be capable of handling data in a scalable manner, even when the data come from different sources and stakeholders and in different formats. Furthermore, the data to be processed and managed will change over the lifecycle of the system.

One of the first drawbacks experienced was that, although data has been collected for several years now, the data collection, aggregation and storage systems were not designed with later use in a Maintenance 4.0 solution in mind. A further issue is that meta-data for the already collected data is often lacking. This complicates the design of data analyses considerably.

Both issues can be addressed by using the OSA-CBM domain model [17]. Other interoperability and data source-related issues are also handled in the project.

At the heart of any Maintenance 4.0 solution, there are innovative analysis, prediction and planning functions that operate on the different data sets. Within MANTIS, the main analytic services fall into three categories:

  a. Remaining useful life (RUL) of components: continuous tracking of telemetry (usage) data and estimating how much time the given device or component has left before it needs to be replaced.

  b. Fault prediction (FP): the system shall predict, based on diagnostic data, an upcoming failure mode (as opposed to the wear-out detected by RUL).

  c. Root cause analysis (RCA): when an unpredicted, complex failure occurs, the system shall deduce the actual module (cause) that caused the break.
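To make the RUL service concrete, the following Python sketch estimates remaining useful life by fitting a straight line to a wear indicator and extrapolating it to a failure threshold. The linear degradation model, the function name `estimate_rul`, the threshold and the readings are assumptions chosen for illustration; the text does not state which estimation methods the MANTIS use cases apply.

```python
def estimate_rul(times, wear, threshold):
    """Estimate remaining useful life by linear extrapolation.

    Fits wear = intercept + slope * t by ordinary least squares and returns
    the time left, measured from the last sample, until the fitted line
    reaches `threshold`. Returns None when no upward wear trend is present.
    """
    n = len(times)
    if n < 2 or n != len(wear):
        raise ValueError("need at least two (time, wear) pairs")

    t_mean = sum(times) / n
    w_mean = sum(wear) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time stamps must not all be identical")
    sxy = sum((t - t_mean) * (w - w_mean) for t, w in zip(times, wear))

    slope = sxy / sxx
    if slope <= 0:
        return None  # wear is flat or decreasing: nothing to extrapolate

    intercept = w_mean - slope * t_mean
    t_fail = (threshold - intercept) / slope
    return max(0.0, t_fail - times[-1])


# Hypothetical telemetry: operating hours and a normalized wear indicator.
hours = [0, 100, 200, 300]
wear_level = [0.0, 0.1, 0.2, 0.3]
rul_hours = estimate_rul(hours, wear_level, threshold=1.0)
print(rul_hours)
```

With these made-up readings the wear grows by 0.001 per hour, so the sketch reports roughly 700 hours left. A real deployment would need a degradation model validated for the component in question.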

The MANTIS reference architecture platform itself is designed to facilitate these services in a flexible and scalable manner. Here, we rely on the following architectural decisions and patterns:

  a. The edge computing paradigm can be used to reduce the data sent to the platform and to enable on-site maintenance operations with low latency. Therefore, a corresponding platform needs to be provided at the edge for storage, limited analytics, HMI and lifecycle management support. This is where the CPS are located and execute their general production operations.

  b. The overall data flow architecture shall follow the IoT reference architecture model proposed by the Industrial Internet Consortium [18].

  c. The MANTIS platform shall integrate into the existing enterprise infrastructure in a service-oriented manner [19] to ease integration and maintainability.

  d. To enable multi-stakeholder interactions, a dynamic attachment procedure must be implemented (between edge and platform instances) so that many interested partners can access a single edge installation and acquire the selected data intended for them.

To this end, the MANTIS reference architecture extends the IoT reference model with two other aspects, as shown in Fig. 2.

Fig. 2 Overview of the MANTIS reference architecture

Firstly, for data processing in the platform, MANTIS adopts a kappa or lambda architecture model [20], which fits current trends in industrial big data analytics. According to the generalized lambda architecture pattern [19] defined by industry experts, data can be processed either as soon as it reaches the platform (stream processing) or later, on demand, after being fetched from storage (batch processing). Here, we use both the stream- and batch-processing technologies described by the lambda pattern to enable the three major maintenance-related tasks listed above. This pipelined reference architecture model is implemented for 11 use cases within MANTIS.
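The interplay of stream and batch processing in the lambda pattern can be sketched in a few lines of Python. This is a toy, in-memory illustration and not the MANTIS implementation: the class `LambdaPipeline`, its method names and the alarm-counting query are hypothetical. Every event is appended to a master log, a speed view is updated immediately, a batch job periodically recomputes a view from the whole log, and queries merge the two views.

```python
from collections import Counter


class LambdaPipeline:
    """Toy lambda architecture that counts alarms per machine."""

    def __init__(self):
        self.master = []              # append-only log of all events
        self.batch_view = Counter()   # recomputed from the full log
        self.speed_view = Counter()   # covers events since the last batch run

    def ingest(self, machine_id, alarm):
        """Stream path: store the event and update the speed view at once."""
        self.master.append((machine_id, alarm))
        if alarm:
            self.speed_view[machine_id] += 1

    def run_batch(self):
        """Batch path: rebuild the batch view from storage, reset the speed view."""
        self.batch_view = Counter(m for m, alarm in self.master if alarm)
        self.speed_view.clear()

    def alarm_count(self, machine_id):
        """Serving layer: merge the batch result with recent stream results."""
        return self.batch_view[machine_id] + self.speed_view[machine_id]


pipeline = LambdaPipeline()
pipeline.ingest("m1", True)
pipeline.ingest("m1", False)
pipeline.ingest("m2", True)
count_before_batch = pipeline.alarm_count("m1")   # answered by the speed view

pipeline.run_batch()                               # batch view now holds history
pipeline.ingest("m1", True)                        # arrives after the batch run
count_after_batch = pipeline.alarm_count("m1")     # batch view + speed view
print(count_before_batch, count_after_batch)
```

A kappa-style variant would drop `run_batch` and treat the log itself as a replayable stream; which of the two fits better depends on the use case.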

The platform tier can be cloud-based, in either a public cloud or a local cloud running on corporate servers. The platform tier receives, processes and forwards control commands from the enterprise personnel or systems in general. It consolidates all the above-described processes and analyses data flows from the edge level. It provides management functions for devices and assets. It also offers non-domain-specific services such as data query and analytics. In order to provide the MANTIS partners with a concrete implementation of the MANTIS reference architecture, a reference implementation, as shown in Fig. 2, has been developed.

The MANTIS reference implementation relies on the Arrowhead Framework [21] to enable adequate edge-cloud interoperability. Besides addressing edge computing interoperability and connectivity issues, especially if real-time control loops are kept, Arrowhead is also used to resolve requirements #c and #d (Sect. 2.1) regarding multi-stakeholder applications. Multiple cloud platforms from various vendors are enabled to access one single production site or edge device in order to get the information necessary for their business purposes.

In addition to providing access to the platform, the edge broker adds translation functionality to the solution. This enables the conversion of heterogeneous data formats and protocols to the requirements of the platform, reinforcing the interoperability of the system with different CPS (requirement #c in Sect. 2.1). Interoperability at platform level is achieved by including the functionality of an enterprise service bus (ESB). Among the main features addressed by the ESB in MANTIS, it is worth mentioning its capability to mediate between communication protocols, data formats and messaging standards coming from the CPS and storage repositories at platform level. The edge broker also addresses the translation or mapping of data formats and protocols to the reference architecture provided at platform level. The ESB enables the monitoring of those transactions, assures scalability, provides fault tolerance mechanisms and allows the dynamic provisioning of resources.
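The translation role of the edge broker can be illustrated with a minimal Python sketch in which payloads arriving in different formats are mapped onto one common record. The two payload layouts, the field names and the function `translate` are hypothetical; they are not the MANTIS, Arrowhead or MIMOSA data formats.

```python
import json


def from_json_payload(payload):
    """Translate a (hypothetical) JSON device message into the common record."""
    doc = json.loads(payload)
    return {
        "asset": doc["id"],
        "quantity": doc["type"],
        "value": float(doc["val"]),
        "unit": doc["unit"],
    }


def from_csv_payload(payload):
    """Translate a (hypothetical) semicolon-separated message into the common record."""
    asset, quantity, value, unit = payload.strip().split(";")
    return {"asset": asset, "quantity": quantity, "value": float(value), "unit": unit}


# One translator per supported source format; new formats are added here.
TRANSLATORS = {"json": from_json_payload, "csv": from_csv_payload}


def translate(fmt, payload):
    """Dispatch a raw payload to the translator registered for its format."""
    try:
        translator = TRANSLATORS[fmt]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt}") from None
    return translator(payload)


json_msg = '{"id": "press-7", "type": "temperature", "val": "71.5", "unit": "C"}'
csv_msg = "press-7;temperature;71.5;C\n"

record_a = translate("json", json_msg)
record_b = translate("csv", csv_msg)
print(record_a == record_b)
```

Keeping the translators in a lookup table means a new device format can be supported without touching the code that consumes the common record, which is the property the edge broker is meant to provide.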

MIMOSA has been integrated into the MANTIS reference architecture as the information model that provides the data structures enabling the collection of maintenance information (requirement #a in Sect. 2.1). Distributed File System (DFS) storage resources are also available at platform level, fulfilling requirement #b in Sect. 2.1.

4 Conclusion

After describing the current environment, expectations and challenges for the domain of CBM, this paper summarizes the MANTIS reference architecture and its reference implementation, which address many of the aforementioned challenges. The MANTIS reference architecture provides means for implementing collaborative maintenance by taking advantage of the omnipresence of intelligent devices and the maturity level reached by cloud-based infrastructure. The platform covers data collection from sensors, data pre-processing at the edge level, data flow management, batch and stream processing of data, as well as data presentation to application-specific human–machine interfaces. Maintenance-specific analysis is covered through cloud-based data processing, including methods and algorithms for estimating remaining useful life, predicting failure and providing root cause analysis results. The challenges of data representation and object modelling are tackled by the MIMOSA concept, with its Common Conceptual Object Model, its Common Interoperability Register and especially the Common Relational Information Schema, which provides a means to store enterprise O&M information. Interoperability issues of all involved parties are covered by the support of the Arrowhead Framework. The edge broker provides access to the platform, translation and mediation services and data flow capabilities. The platform as a whole assures scalability, dynamic resource provisioning and monitoring, and it offers real-time and batch processing mechanisms and tools for the intelligent management of industrial operations such as CBM.