Introduction

Cyber-physical systems (CPS) represent more than networking and information technology: information and knowledge are integrated into physical objects. By integrating perception, communication, learning, behaviour generation, and reasoning into such systems, a new generation of intelligent and autonomous systems is to be developed.

A large-scale CPS can be envisioned as millions of networked smart devices, sensors, and actuators embedded in the physical world, which can sense, process, and communicate data across the network. The proliferation of technology-mediated social interactions via these feature-rich, networked smart devices has allowed many individuals to contribute to the volume of Big Data available. The data generated by CPSs are contextualised, which helps transform data into information. This makes CPSs, in the context of Industry 4.0, a huge source of information that includes, often implicitly, relationships about the environment and the working domain. This information and these relationships are a potential source of knowledge that needs to be extracted, formalised and, potentially, reused. To implement this knowledge extraction, it is necessary to study in depth the characteristics that the systems under examination must and can have, so that the methods can be characterized according to their potential. To this end, this study completes the work of the same authors [1] on CPS characteristics and their frequency of appearance in the literature, which aimed to detect trends and gather the current opinion of researchers regarding CPS characteristics. Consequently, this work focuses on the various meta-models presented in the literature to extract the most-studied characteristics while presenting a meta-model that can satisfy all modelling needs.

The work is structured as follows: (a) the following section describes the adopted methodology in detail; (b) previous surveys in the context of CPS metamodels are examined in “Previous studies on CPS metamodels and their applications in real world”; (c) a descriptive background on formal concept analysis (FCA) is presented in “Background”, together with a discussion on the CPS characteristics; (d) “Clustering assessment on characteristics using the formal concept analysis method” presents an analysis of the characteristics with the help of FCA, which enabled the clustering of the inspected CPS characteristics; (e) the subsequent section presents the discussion on the results of the analysis; (f) finally, the conclusion and future work are presented in “Conclusions”.

Methodology

The methodology of this study has two main parts. It starts with a state-of-the-art review of cyber-physical systems metamodels and the CPS characteristics represented in the scientific papers. Then, a formal concept analysis (FCA) is run on the results of the state-of-the-art review to reveal any hidden relationships among the characteristics.

The state-of-the-art review focused on CPS knowledge representation in different scientific papers. To this end, the following research questions (RQ) are answered throughout the work:

  • RQ (1) ‘How are CPS metamodels described and characterized?’

  • RQ (2) ‘How is Knowledge represented in CPS metamodels?’

Consequently, papers were identified through a structured keyword search on major databases and publisher websites (Scopus, Elsevier and ScienceDirect). The search statement was built from the keywords “Cyber (-) Physical system” AND (“Metamodel” OR “Meta-model”) to obtain the first pool of articles. All the searches were applied in the “Title, Keyword, Abstract” field.
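The search statement can be composed programmatically, as in the following minimal sketch. The `TITLE-ABS-KEY` field code follows Scopus advanced-search syntax; the helper name `build_query` and the exact spelling variants are assumptions made for illustration and are not the authors' actual query.

```python
def build_query(system_terms, model_terms, field="TITLE-ABS-KEY"):
    """Combine two keyword groups: OR within a group, AND between the groups."""
    def quote(term):
        return f'"{term}"'

    systems = " OR ".join(quote(t) for t in system_terms)
    models = " OR ".join(quote(t) for t in model_terms)
    return f"{field}(({systems}) AND ({models}))"


# Assumed spelling variants: "Cyber (-) Physical system" is read as
# "with or without the hyphen".
query = build_query(
    ["cyber physical system", "cyber-physical system"],
    ["metamodel", "meta-model"],
)
print(query)
```

Keeping the two keyword groups as separate lists makes it easy to add further spelling variants without touching the Boolean structure of the statement.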

As the first step, articles were categorised as included or excluded. In this step, the title, abstract and keywords of each article were screened, and the article was included in or excluded from the study based on the exclusion criteria (EC) below:

  • EC(1): entire conference proceedings

  • EC(2): articles that do not develop a (meta) model of CPS

  • EC(3): articles that do not represent or study a CPS characteristic

In the second step, a content analysis was conducted, in which the full text of the included articles was read. In this phase articles were excluded based on the following exclusion criteria:

  • EC(4): non-English articles

  • EC(5): articles whose full-texts were not available.
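The two-step screening can be sketched as a pair of filters applied in sequence. This is only an illustration under stated assumptions: the `Article` record, its field names and the sample pool are hypothetical, and in the study these decisions were taken by reading the articles, not by automated flags.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    is_proceedings: bool = False          # EC(1)
    develops_metamodel: bool = True       # EC(2)
    studies_characteristic: bool = True   # EC(3)
    in_english: bool = True               # EC(4)
    full_text_available: bool = True      # EC(5)


def passes_screening(a):
    """Step 1: title, abstract and keyword screening against EC(1)-EC(3)."""
    return (not a.is_proceedings
            and a.develops_metamodel
            and a.studies_characteristic)


def passes_content_analysis(a):
    """Step 2: full-text reading against EC(4)-EC(5)."""
    return a.in_english and a.full_text_available


def select(articles):
    screened = [a for a in articles if passes_screening(a)]
    return [a for a in screened if passes_content_analysis(a)]


# Hypothetical records, invented purely for illustration.
pool = [
    Article("A"),
    Article("B", is_proceedings=True),
    Article("C", full_text_available=False),
    Article("D", studies_characteristic=False),
]
included = [a.title for a in select(pool)]
print(included)
```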

A systematic analysis was run to assess which CPS characteristics the included articles discuss, explicitly or implicitly. Subsequently, a snowball analysis was carried out by scanning the references of the included articles to identify further articles that discuss CPS characteristics. A schematic view of the literature review is shown in Fig. 1.

Fig. 1 Literature review process

Previous Studies on CPS Metamodels and Their Applications in Real World

CPSs have been widely discussed in the literature. [2] thoroughly discusses the characteristics and architecture of CPS and then investigates different research on CPS information processing, CPS software systems, CPS system security and CPS system testbeds. Having studied these, the authors conclude that the biggest challenge in the development of CPS is the limitation of existing theory and technology in computation, communications, and control. After investigating the structure of CPS, [3] makes a comprehensive survey of different application domains of CPS, such as energy handling, network security, and data transmission and management. Afterwards, they briefly explore the models and methods used for the development of CPSs; domain-specific modelling (DSM), model-driven development (MDD) and model-integrated computing are among the prominent ones. In a different domain, the importance of smart healthcare cyber-physical systems (SHCPS) is discussed in [4], as the COVID-19 pandemic in 2020 showed the urgent need for continuous communication and information exchange of physiological data through efficient external monitoring and control of patients and medical equipment. The authors define five levels of healthcare cyber-physical systems (HCPS): (1) unit level, the basic level of HCPS, which can provide monitoring and control for patients in intensive care units or at the hospital level; (2) integration level, which integrates hospitals and smart homes to provide remote healthcare and monitoring to patients; (3) system level, which forms smart-city healthcare through several autonomous CPSs, creating a smart grid of smart homes, smart hospitals, smart ambulances and even smart manufacturing units to provide a smart, quality healthcare system to patients; (4) acceptance level, which coordinates different researchers, technologists, engineers, health experts and academicians to help define effective policies and implement a successful ecosystem; and (5) evolutionary level, with self-adaptability and self-management characteristics, which can learn from past data in the healthcare system and act accordingly in current scenarios. In general, due to the structure of the SHCPS, characteristics like level of autonomy, security and reliability are discussed, since they form the backbone of healthcare systems and are desired by the CPS structure. In another work, cyber-physical production systems (CPPS), their design and their application are the focal point of the study run by [5]. The 5C architecture of CPS (Smart Connection Level, Data-to-Information Conversion Level, Cyber Level, Cognition Level and Configuration Level) is also discussed in depth with regard to CPPS. Also considering CPPS, [6] presents a metamodel-based CPPS that tries to integrate information from different software into comprehensive models of CPPS. The integration benefits from flexible interoperability alignments among networks. The resulting integration inside the networks enables the collaboration of production systems while avoiding time-consuming and deadlock-prone semantic standardization efforts.

On the other hand, [7] categorizes the application domains of CPS into 10 main categories and discusses the work done in each category: agriculture, education, energy management, environmental monitoring, medical devices and systems, process control, security, smart city and smart home, smart manufacturing, and transportation systems. To name a few of the many examples, they mention the work of [8], in which a “Rat Detection system” (RDS) was developed to help monitor rats in agricultural fields. This CPS-based system reduces the costs of rat control, crop waste and environmental contamination. In the energy management sector, they point out the work of [9], who designed a CPS application for the Energy Management Framework (EMF). The designed CPS collects the real-time power consumption demand and status from an autonomous electric vehicle (AEV) and the charging station in a smart grid. This EMF hierarchical network architecture minimizes the energy consumption of wireless sensor networks (WSNs) to optimize power supply and distribution. In the process control field, [10] offer control-theoretic software-monitoring solutions for coordinating time predictability and memory utilization in the runtime monitoring of systems that interact with the physical world. In the smart manufacturing category, the work of [11] is mentioned, which develops a flexible, modular and distributed control architecture for automated warehouse systems using Function Blocks and a CPS perspective. In the category of intelligent transportation, the work of [12, 13] on traffic management in transport engineering is introduced. Thanks to intelligent cyber-physical road systems, the automatic collection of traffic data became possible, so that the number of vehicles travelling from one geographical location to another could be measured.

Background

Background on Formal Concept Analysis

Formal concept analysis (FCA) is based on lattice theory [14] and was proposed by the German mathematician Wille [30] in 1982. FCA uses a formal context to represent the relationships among concepts and attributes. It is a mathematical theory for handling concepts and their hierarchies [15]. FCA is best used for knowledge representation, data analysis, and information management. It detects conceptual structures in data and consequently extracts dependencies within the data by forming a collection of objects and their properties, known as attributes [16, 17].

The concept lattice consists of nodes, where each node is a formal concept with two parts: an extent and an intent. An extent is a set of objects in the concept domain, while an intent is the set of attributes that are shared by the objects in that set. Accordingly, a formal concept is a collection of objects with some common attributes; finding formal concepts in a lattice therefore amounts to clustering the objects within the object set [15].

A formal context is a triple K = (U, M, I), where U is a set of objects (or samples), M is a set of attributes (or features), and I ⊆ U × M is a binary relation, called the incidence relation, that expresses which object has which attribute. For any x ∈ U and m ∈ M, (x, m) ∈ I represents that the object x has the attribute m.

A formal context can be easily represented by a table (see Table 1), where the rows are headed by the object names (here \({x}_{i}\)) and the columns are headed by the attribute names (here \({m}_{j}\)). For example, a cross in row 2 and column 3 means that \({x}_{2}\) has the attribute \({m}_{3}\).

Table 1 An example of a formal context K
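To make the notions of formal context, extent and intent concrete, the following minimal sketch stores a small context as a mapping from objects to attribute sets and enumerates its formal concepts. The context itself is hypothetical (it is not the content of Table 1), and the brute-force enumeration over all object subsets is an assumption chosen for readability; it is only practical for very small contexts.

```python
from itertools import combinations

# Hypothetical formal context K = (U, M, I): object -> attributes it has.
context = {
    "x1": {"m1", "m2"},
    "x2": {"m2", "m3"},
    "x3": {"m1", "m2", "m3"},
}
objects = set(context)
attributes = set().union(*context.values())


def common_attributes(objs):
    """Intent of a set of objects: the attributes shared by all of them."""
    if not objs:
        return set(attributes)
    return set.intersection(*(context[x] for x in objs))


def common_objects(attrs):
    """Extent of a set of attributes: the objects that have all of them."""
    return {x for x in objects if attrs <= context[x]}


def formal_concepts():
    """Close every subset of objects and collect the (extent, intent) pairs."""
    found = set()
    for r in range(len(objects) + 1):
        for subset in combinations(sorted(objects), r):
            intent = common_attributes(set(subset))
            extent = common_objects(intent)
            found.add((frozenset(extent), frozenset(intent)))
    return found


concepts = formal_concepts()
for extent, intent in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

Each printed pair is a node of the concept lattice: the extent is the cluster of objects, and the intent is the set of attributes that defines the cluster.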

To discover any hidden relationship among the attributes, FCA employs association rule mining (ARM). FCA first constructs the formal context from the sets of objects and their attributes. Using this formal context, it extracts the underlying information by creating the concept lattice; then, by applying ARM, it detects regularities between attributes in large data sets and tries to identify patterns for attributes that have frequently been seen together [18].

Let \(I = \{i_{1}, i_{2}, \ldots, i_{n}\}\) be a set of n binary attributes called items. Let \(D = \{t_{1}, t_{2}, \ldots, t_{m}\}\) be a set of transactions called the database. Each transaction in \(D\) has a unique transaction ID and contains a subset of the items in \(I\). A rule is defined as an implication of the form \(X \Rightarrow Y\), where \(X, Y \subseteq I\) and \(X \cap Y = \emptyset\). The sets of items (itemsets for short) \(X\) and \(Y\) are called the antecedent and the consequent of the rule, respectively [19]. Such a rule means that if \(X\) is chosen, then it is likely that \(Y\) is also selected.

To better extract rules, some measures are defined in the FCA-based ARM. The best-known measures are Support and Confidence, which are the main measures employed in the present study.

The support supp(X) of an itemset X is defined as “the proportion of transactions in the data set which contain the itemset”. For example, if the support of itemset X is 0.4, the itemset occurs in 40% of all transactions. The confidence of a rule is defined as \({\text{conf}}(X \Rightarrow Y) = {\text{supp}}(X \cup Y)/{\text{supp}}(X)\) and can be interpreted as an estimate of the probability \(P(Y|X)\), that is, the probability of finding the consequent of the rule in transactions under the condition that these transactions also contain the antecedent. For example, if \({\text{conf}}(X \Rightarrow Y) = 0.5\), the rule \(X \Rightarrow Y\) holds in 50% of the transactions containing \(X\) [19].
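The two measures can be computed directly from their definitions, as in the following minimal sketch. The transaction database is hypothetical and was invented only to keep the arithmetic easy to follow.

```python
# Hypothetical transaction database D; each transaction is a set of items.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]


def support(itemset):
    """Proportion of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent):
    """conf(X => Y) = supp(X union Y) / supp(X), an estimate of P(Y | X)."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)


supp_a = support({"a"})
conf_a_b = confidence({"a"}, {"b"})
print(f"supp(a) = {supp_a:.2f}, conf(a => b) = {conf_a_b:.2f}")
```

Here item a appears in four of the five transactions and a and b appear together in three, so the rule a ⇒ b holds in three of the four transactions that contain a.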

FCA-based ARM can be a very helpful method in recognising the patterns as it: (1) extracts all the association rules from a given data without redundancies; (2) generates the rules faster and more efficiently; and (3) discovers more significant rules [18].

Background and Study on CPS Characteristics

CPSs are often engineered systems and are differentiated from other types of engineered systems in that they are built on the integration of cyber and physical components. It is, therefore, agreed that CPS functionalities come from the tight integration of the cyber and physical sides, which gives rise to CPS characteristics described in different terms. On the other hand, CPSs should be characterized by well-defined components. They should provide components with well-known characteristics described using standardized semantics and syntax. Therefore, defining and shaping key characteristics of CPSs will pave the way to better development and implementation management within and across various domains of CPS applications [20]. Considering the above, and exploring how CPS metamodels are characterized and defined, the present study focuses on exploring CPS characteristics in various domains in scientific papers.

Napoleone et al. [21] discussed in detail the technological characteristics of CPSs in manufacturing that emerge from the existing literature. They carried out a structured review to investigate the CPS characteristics that have been studied in scientific papers. They arrived at the 19 most cited lower-order characteristics, provided literature-based descriptions of them and, explaining their reasoning, aggregated them into eight higher-order characteristics. Based on their work, a base list of CPS characteristics was adopted here with the aim of delineating CPS metamodels. The content analysis in our work was therefore deductive. However, while analysing the papers and digging deeper into the study, the list of characteristics considered in the analysis was modified to the one shown in Table 2.

Table 2 Studied CPS characteristics and their definition in the literature

Clustering Assessment on Characteristics Using the Formal Concept Analysis Method

To answer the two research questions, “How are CPS metamodels described and characterized?” and “How is Knowledge represented in CPS metamodels?”, scientific papers were examined to determine whether they discuss, implicitly or explicitly, the CPS characteristics given in the last section. Formal concept analysis (FCA), as a clustering technique, was chosen to help us first describe the CPS metamodels and then scrutinize the CPS characteristics and the hidden relationships between them in the chosen papers.

FCA was discussed previously in the background section. As mentioned there, it detects conceptual structures in data and consequently extracts dependencies within the data by forming a collection of objects and their properties [17]. The FCA method starts with a formal context, as shown in Table 1, where the input data form a matrix in which each row represents an object from the domain of interest and each column represents one of the defined attributes. In the present study, the formal context is formed by taking articles as the objects and the CPS characteristics as the attributes. If an article has, implicitly or explicitly, investigated a CPS characteristic in its metamodel, a “×” is entered in the corresponding cell. Otherwise, the cell remains empty. Table 3 represents the formal context prepared for the analysis in this work.

Table 3 The formal context

In general, FCA results in two sets of output data: a hierarchical relationship of all the established concepts in the form of a Hasse diagram, called a concept lattice, and a list of all interdependencies found among attributes in the formal context. The latter is what was used for the analysis of the CPS characteristics in this work. As explained previously, FCA uses the formal context to extract information and detect regularities between attributes so that it can identify patterns for attributes that have frequently been seen together. In the present study, FCA revealed regularities among CPS characteristics, detecting which characteristics have been studied more frequently (single clustering) and which characteristics have regularly been studied or used together (double clustering).
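The idea behind single and double clustering can be illustrated by counting, over a formal context, how often each characteristic occurs and how often each pair of characteristics occurs together. The sketch below is a simplified stand-in based on plain co-occurrence counts, not the procedure of the FCA tool used in the study, and the article names and crosses are hypothetical rather than taken from Table 3.

```python
from collections import Counter
from itertools import combinations

# Hypothetical formal context: article -> characteristics marked with a cross.
formal_context = {
    "article_1": {"Resiliency", "Redundancy"},
    "article_2": {"Resiliency", "Fault-Tolerant", "Diagnosability"},
    "article_3": {"Resiliency", "Redundancy", "Safety and Security"},
    "article_4": {"Controllability"},
}


def single_counts(context):
    """Number of articles that address each characteristic."""
    return Counter(c for chars in context.values() for c in chars)


def pair_counts(context):
    """Number of articles that address each pair of characteristics together."""
    counts = Counter()
    for chars in context.values():
        # Sorting makes each pair appear in one canonical order.
        counts.update(combinations(sorted(chars), 2))
    return counts


singles = single_counts(formal_context)
pairs = pair_counts(formal_context)
print(singles.most_common(3))
print(pairs.most_common(3))
```

Dividing each count by the number of articles gives the support of the corresponding single characteristic or pair.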

Figure 2 shows the result of FCA for the single clustering of CPS characteristics. As can be seen, “Resiliency” stood at the top of the list, with a noticeable difference from the rest, as the characteristic most reflected in the literature, whether explicitly or implicitly mentioned. Characteristics like “Fault-Tolerant”, “Diagnosability”, “Redundancy” and “Safety and Security” come next in the list, with a noticeable difference from Resiliency and negligible divergence among themselves. On the other hand, characteristics like Reconfigurability, Collaboration, Controllability, and Self-capabilities are at the end of the list, which does not imply that these characteristics lack importance. The main reason may be that they are fundamental characteristics that are taken for granted in the design and application of CPSs.

Fig. 2 Single clustering of CPS characteristics

Figure 3 shows the pairwise coupling of characteristics extracted from the analysed papers through FCA. The combination of Resiliency with other characteristics is the one observed the most, which was predictable from the analysis of the single characteristics. In particular, the pairs {Resiliency; Redundancy}, {Resiliency; Safety and Security}, {Resiliency; Fault-Tolerant} and {Resiliency; Diagnosability} rank at the top, in that order, which suggests a close relationship between the concepts; this outcome forms the backbone of the upcoming discussion.

Fig. 3 Double clustering of CPS characteristics

As discussed previously, FCA uses association rules to help detect patterns among regular attributes. Moreover, the measures of support and confidence have been introduced to check the reliability of the detected patterns and also their probability of appearance.

To serve this purpose, the software LATTICE MINER 2.0 was adopted for the analysis. The association rules between the selected CPS characteristics were extracted with a minimum support level of 20% and a minimum confidence level of 20%, as shown in Table 4. The minimum levels were defined by a trial-and-error procedure.

Table 4 Association rules between the CPS characteristics
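The effect of the two thresholds can be illustrated with a small brute-force rule miner. This sketch is not how the tool used in the study derives its rules (it works from the concept lattice); it simply enumerates itemsets, restricts rules to single-item consequents for brevity, and uses hypothetical data rather than the formal context of Table 3. The function name `mine_rules` and its parameters are assumptions made for illustration.

```python
from itertools import combinations


def mine_rules(transactions, min_support=0.2, min_confidence=0.2):
    """Brute-force rules X => y with a single-item consequent (a simplification)."""
    n = len(transactions)
    items = sorted(set().union(*transactions))

    def supp(itemset):
        return sum(itemset <= t for t in transactions) / n

    rules = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            s = supp(itemset)
            if s == 0 or s < min_support:
                continue  # itemset too rare to yield a rule
            for y in combo:
                x = itemset - {y}
                conf = s / supp(x)  # supp(x) >= s > 0, so this is safe
                if conf >= min_confidence:
                    rules.append((tuple(sorted(x)), y, s, conf))
    return rules


# Hypothetical article-by-characteristic data, invented for illustration.
articles = [
    {"Resiliency", "Redundancy"},
    {"Resiliency", "Fault-Tolerant"},
    {"Resiliency", "Redundancy", "Fault-Tolerant"},
    {"Diagnosability"},
]
for x, y, s, c in mine_rules(articles):
    print(f"{', '.join(x)} => {y}  support={s:.2f}  confidence={c:.2f}")
```

Raising `min_support` or `min_confidence` shrinks the rule list, which is why the thresholds are usually tuned until the number of rules is manageable and still informative.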

Looking through the association rules, the probability of achieving resiliency through fault tolerance, diagnosability, safety and security, and finally redundancy exceeds 84%, which confirms the result of the first step of FCA. It is also worth noting that resiliency is in all the itemsets that have support levels above 20% and a confidence of 50% and above.

Discussion on the Results

Concerning the results of FCA achieved in the previous part, resiliency draws the most attention to itself among other characteristics. Different terms were used and established in the literature for a CPS that has “resilience” such as survivable [104] or Fail-safe [114].

Furthermore, the present study investigated the concept regardless of whether it was explicitly or implicitly discussed in scientific papers. To name a few, [22] tried to reach resiliency by modelling the functions and the links between the components of the metamodel with the help of FCA. By looking at the hierarchical inclusion of the CPS metamodel, and thanks to the created lattice, they could gain control over redundancy and therefore elevate the resiliency of the system. Sangiovanni-Vincentelli et al. [91] addressed the systems engineering of cyber-physical contract-based design by employing structured and formal design methodologies to increase the reliability and consequently the resiliency of the CPS meta-model. Although [71] did not mention resiliency directly as an objective of their study, they targeted it implicitly through an integration of the physical layer, the network layer and the business layer. This integration leads to a better investigation of the hardware status information, software, patches and other information for perception, acquisition and control. It results in a platform that increases the controllability, diagnosability and fault tolerance of the CPS, which leads to greater survivability of the system.

Given the importance of the concept, different paths were taken to reach and increase the resiliency of a CPS. According to the observed results, two main tracks were used most frequently, namely the characteristics ‘safety and security’ and ‘fault-tolerance’. For example, [124] believes that only by unifying safety, security and resiliency is it possible to reach adaptable and dynamic design patterns that can take into account the intended functions of a system. [114] explored fault tolerant control systems (FTCS) and mentioned that they can withstand the failures and errors of the components of the system itself and preserve the system performance as far as possible; therefore, they can survive and be resilient.

Digging a bit deeper, the resiliency of a system was associated with recognizing different challenges and risks, defining proper metrics to protect the endangered system, and estimating plant states despite attacks [22, 112]. The trend illustrates different efforts to elevate the resiliency of the system through characteristics like predictability and diagnosability, which also ranked high in the FCA double clustering.

Redundancy and reliability were also the characteristics that coupled well with resiliency in FCA and were also discussed closely with the concept in the literature. As mentioned by [112], redundancy is the principle that can be advantageous in estimating resiliency in the majority of systems. On the other hand, the intention of redundancy in the system can be to increase its reliability since it relies on employing multi-pronged solutions rather than a single technique which also improves the security and resiliency of the system [22].

In addition, stability was also a characteristic that received attention in reaching safety, security and consequently the resiliency of the system, since fast reconfiguration in response to attacks can maintain the stability of the system, which keeps it safe and helps it retain normal operation [99].

A Schematic Hierarchy of the CPS Characteristics

Based on the observations and the FCA outcomes, the characteristic “resiliency” was identified as the most used and applied among the others. Therefore, we used the discovered implicit relationships among the characteristics (thanks to FCA-based ARM) to design a hierarchy of characteristics that maps the way to achieving resiliency in a CPS. The hierarchy illustrates different levels of characteristics that directly or indirectly influence resiliency, from built-in characteristics that cannot be ignored in the design of a CPS to the ones that have a non-breakable effect on the top level.

As depicted in Fig. 4, the hierarchy represents 4 different levels:

  • the top level (Level 1) is dedicated to the characteristics that were used the most frequently among the ones investigated in the scientific papers, i.e., resiliency.

  • Level 2 represents the second-graded characteristics that have a linear connection with the characteristic in Level 1 and a direct influence on it, i.e., Safety and Security and Fault-tolerance.

  • Level 3 portrays the characteristics that help the CPS reach the second-graded ones in Level 2 and consequently resiliency in the first level.

  • Level 4 illustrates the bottom-line characteristics that do not directly lead to elevating resiliency but are fundamental to forming a CPS, without which the system might not function efficiently.

Fig. 4 Hierarchy of characteristics to reach resiliency

As mentioned above, the hierarchy resulted from the study of the hidden relationships among the characteristics of CPS and how they are related to each other. According to the FCA results and the consequent association rules, resiliency was the characteristic that appeared the most while searching for solutions for the survival of the CPS meta-models and a more efficient performance. The proposed hierarchy maps the road of observed characteristics to achieve resiliency in a CPS meta-model, from the built-in characteristics to the top.

Following the results of FCA, the higher-level characteristics in the hierarchy have been redefined using other characteristics that appear to be related to them. The relationships come from the FCA and from the positioning of the characteristics in the hierarchy; the related characteristics therefore contribute to the acts, behaviours and consequently the definition of the main characteristic. Table 5 captures the connected terms in the redefinition of the selected characteristics.

Table 5 Characteristics redefinition due to the hierarchy

Subsequently, resiliency, as the most frequent characteristic, has been defined as the capability to keep the safety, stability, and quality of the service in the time of (un)predicted faults and threats and to provide the networking system with different alternatives that help it operate with no interruption. The definition relates closely to the other two characteristics that pursue resiliency, namely safety and security, and fault tolerance. As disclosed above, CPS engineering relies mostly on keeping the system safe, secure and stable at the time of predicted or sudden attacks or threats. This aim can be reached through functional or operational alternatives and the real-time identification and detection of failures or faults, by considering the fundamental characteristics of the CPS located at the bottom level.

Conclusions

The paper continues the previous work of the authors on studying cyber-physical systems and their representative characteristics in the literature. Two research questions guided the search, i.e., ‘How are CPS metamodels described and characterized?’ and ‘How is Knowledge represented in CPS metamodels?’, through which CPS metamodels were investigated regarding the characteristics they are designed to mirror. A literature review focusing on the two research questions was carried out to investigate the current opinion in the literature on which characteristics are targeted more frequently in studying a CPS meta-model. Articles in the literature were selected based on two main criteria: (1) they study CPS metamodels and (2) they refer to one or more CPS characteristics in their metamodel study. After a two-step literature review, CPS characteristics, implicitly or explicitly discussed, were extracted. Afterwards, Formal Concept Analysis (FCA) was applied as the clustering technique to detect any hidden relationship among the most used characteristics in the articles. According to the results, “Resiliency” was the characteristic that was targeted the most frequently, implicitly or explicitly, in the scientific papers. “Fault-Tolerant”, “Diagnosability”, “Redundancy” and “Safety and Security” were the ones that followed resiliency in the list, but with a noticeable difference.

Thanks to FCA, the implicit bonds between characteristics in the literature were also disclosed which led to a hierarchy of CPS characteristics aiming at reaching resiliency in the metamodels. A new set of definitions for the highly ranked characteristics was also introduced that sheds light on future CPS metamodel designs regarding what characteristics to target and what path to take to be more aligned with the concept of Industry 4.0.

This study focused on what has been observed in the literature from different researchers in the field, and it aims at reporting trends and themes on CPS characteristics. However, there is still a gap in the literature on cyber-physical systems as to whether any dominant characteristic can be defined in the development of a CPS metamodel. As future work, the result of this study can help in the development of measures or indicators for assessing the significance of any CPS characteristic.