
1 Introduction

Integration—the interconnection of the components that comprise a system—is identified as a major source of problems in the concurrent development of complex engineered systems [62]. This is because each component is developed with assumptions and/or incomplete knowledge about other components of the system, which later turn out to be wrong [63].

To tackle these challenges, there is a need for improved development cycles, with better tools, techniques, and methodologies [65]. While modeling and simulation has been successfully applied to reduce development costs, it falls short in fostering more integrated development processes [7]. To see why, note that a model of the complete system is required for simulation, and consider the following obstacles:

  • Accurately simulating a complete system model might be difficult. For example, the transient simulation of digital circuits is difficult because there are sub-circuits whose dynamics change significantly faster than others [49], forcing the simulation to be run at a prohibitively high level of detail.

  • Heterogeneous systems are best modelled with a mix of formalisms [66]. For example, consider a power window system [56], present in the majority of vehicles produced today. It includes both software elements (best modelled with a Statechart-like formalism) and physical elements (best modelled with a differential-equation-based formalism).

  • Subsystem models might be costly. In systems that encompass subsystems produced by external suppliers, the licensing costs required to get access to models might be too high, due to Intellectual Property concerns. For example, consider the exhaust gas recirculation water handling system reported in [55], where the dirty water is pumped to a water treatment center (externally developed) to be purified and reused. As the authors claim, having higher fidelity models of each of the subsystems would allow the engineers to design better control strategies.

  • Models of subsystems might be black boxes. At later stages in the development process, prototypes for subsystems may be coupled to models of the remaining subsystems, to enable global validation of the system. For example, the validation of the power window controller might be done by simulating the controller in a computer, and connecting it to a real motorized window [18], which is considered a black box from the point of view of the controller. Other black boxes include inductive models of subsystems, produced from extensive physical experimentation. For example, an anti-lock braking system controller might be validated against black box wear and tear models of the braking pads, to evaluate its performance when the effectiveness of these subsystems decreases [21].

A prospective concept to address the above challenges, and unleash the full potential of simulation, is collaborative simulation, also known as co-simulation [40]. This concept concerns the coupling of models created in different formalisms, and makes it possible to simulate the entire system by simulating its constituents and exchanging data between them. Thus, the behavior of a coupled system is computed through the communication of multiple simulation tools, each responsible for computing the behavior of a constituent subsystem [30, 44, 51]. Each simulator is broadly defined as a black box capable of exhibiting behavior, consuming inputs, and producing outputs. Examples of simulators include dynamical systems being integrated by numerical solvers [12], software and its execution platform [16], dedicated real-time hardware simulators (e.g., [34]), physical test stands (e.g., [69, Fig. 3]), or human operators (e.g., [13, Fig. 24], [53, Fig. 6]).

Co-simulation fosters a more integrated development process by allowing different teams to observe how their subsystem behaves when coupled to the rest of the system (full system analysis), while reusing the work done by the other teams. Furthermore, it improves the relationship between external suppliers and system integrators, since the system integrators can use virtual surrogates of the subsystems produced by the suppliers to test their adequacy. With the appropriate Intellectual Property protections, these virtual surrogates can even be provided by the supplier, for increased validity.

In order to run a co-simulation, all that is required is that each participating simulation tool exposes the outputs, and consumes the inputs, of its allocated subsystem over simulated time. The same loose requirements that make co-simulation well suited to integrating many different simulation tools also raise difficult challenges.
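
To make this coupling concrete, the following is a minimal sketch of a fixed-step orchestration loop, written in Python purely for illustration. The `Simulator` interface (`set_input`, `step`, `get_output`) is hypothetical and not taken from any particular standard; it only reflects the loose requirements stated above.

```python
# Minimal sketch of a fixed-step co-simulation loop (Jacobi-style coupling).
# The Simulator interface (set_input / step / get_output) is hypothetical and
# only illustrates the loose requirements described above.

class Simulator:
    def set_input(self, name, value): ...
    def step(self, communication_step): ...   # advance simulated time by H
    def get_output(self, name): ...

def co_simulate(simulators, connections, H, end_time):
    """connections: list of (source_sim, output_name, target_sim, input_name)."""
    t = 0.0
    while t < end_time:
        # 1. Exchange data: outputs computed at time t become inputs at time t.
        for source, output, target, input_ in connections:
            target.set_input(input_, source.get_output(output))
        # 2. Every simulator advances from t to t + H on its own.
        for simulator in simulators:
            simulator.step(H)
        t += H
```

In practice, orchestration algorithms differ in how and when data is exchanged (for example, sequentially or iteratively within each interval), which is one source of the challenges discussed in the remainder of this paper.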

In the following sections, we explore those challenges by first providing a historical overview of co-simulation, then examples of industrial case studies, and finally the emerging trend and its challenges.

2 The Facets of Co-simulation: Historical Overview

Co-simulation is not a new concept. Instead, it is the aggregation of multiple research trends that were sparked by the advances in computer simulation techniques, and by the increased demands on this field. In the following paragraphs, we summarize some of the main milestones that led to the facets of co-simulation. Figure 1 situates these in time.

Fig. 1. Timeline of co-simulation milestones, from the 1970s up to 2015.

2.1 Late 70s and 80s

To the best of our knowledge, the first discrete event synchronization algorithms were published in the late seventies [39], around the same time that Lamport [42] published his seminal paper regarding the ordering of events in distributed process networks. Discrete event simulators compute the behavior of a system by isolating the most important events and computing the state evolution of the system from one event to the next [23]. The state evolves discontinuously, with each discontinuity being caused by an event. In this paradigm, a coupled system can be broken down into subsystems that exchange events, which are then simulated in parallel, each in a separate process. Since processes run in parallel, and react to incoming events by updating their state and potentially sending events, it is important to ensure the correct synchronization of the subsystems, so that no event happening at time \(t_i\) is processed by a subsystem which is already at time \(t>t_i\).
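
As a rough illustration of this rule, the sketch below processes only those events whose timestamps do not exceed the smallest time promised by the neighbouring subsystems. The data structures and function names are illustrative, not taken from [39] or [42].

```python
import heapq
from itertools import count

# Illustrative sketch of conservative synchronization: a subsystem only
# processes an event when no earlier event can still arrive from a neighbour.
# Each neighbour promises never to send events older than its reported clock.

_tie_breaker = count()  # keeps the heap comparable when timestamps are equal

def schedule(event_queue, timestamp, event):
    heapq.heappush(event_queue, (timestamp, next(_tie_breaker), event))

def process_safe_events(event_queue, neighbour_clocks, handle):
    """Process all events that are provably safe, i.e. not later than the
    smallest clock promised by any neighbouring subsystem."""
    safe_horizon = min(neighbour_clocks)
    while event_queue and event_queue[0][0] <= safe_horizon:
        timestamp, _, event = heapq.heappop(event_queue)
        handle(timestamp, event)  # may update local state and emit new events
```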

Around the same time, in the continuous simulation domain, new challenges were being uncovered. The main difference between the continuous and the discrete event simulation domains lies in the fact that the state of a continuous system evolves continuously over time. Simulators of continuous systems that run on digital computers cannot compute every point of their state trajectory. Instead, they rely on the smoothness of these systems (coming from physical laws) to approximate the state evolution at a countable set of points in time [12]. The fundamental tradeoff is: the closer together one wants the time points to be, the more accurate the approximation, but the higher the performance cost.
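
This tradeoff can be made concrete with a simple fixed-step integrator, used here purely for illustration: for the forward Euler method sketched below, shrinking the step size reduces the approximation error roughly proportionally, but increases the number of derivative evaluations correspondingly.

```python
import math

# Illustration of the accuracy/cost tradeoff with forward Euler.

def euler(f, x0, t_end, h):
    n = int(round(t_end / h))   # number of fixed-size steps
    x, t = x0, 0.0
    for _ in range(n):
        x += h * f(t, x)        # one derivative evaluation per step
        t += h
    return x, n

# dx/dt = -x with x(0) = 1 has the exact solution x(t) = exp(-t).
for h in (0.1, 0.01, 0.001):
    approx, evals = euler(lambda t, x: -x, 1.0, 1.0, h)
    error = abs(approx - math.exp(-1.0))
    print(f"h={h}: error={error:.2e}, derivative evaluations={evals}")
```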

In the late seventies and early eighties, as electrical circuits increased in size, their simulation algorithms were becoming a bottleneck in the development process because of the long simulation times. Practitioners noticed that, for sufficiently large circuits, only a small fraction of the subsystems had actively changing voltage levels at any point in time. This led to the development of simulation techniques that, in a similar way to their discrete event based counterparts, only computed a new state of each subsystem when its outputs had changed significantly [49]. Additionally, to exploit parallelism and reduce numerical instabilities, the waveform relaxation techniques were introduced. In these, during a computation interval \(t \rightarrow t+H\), each subsystem was assigned to a simulator which approximated its solution in that interval, using whatever simulation step size was required to keep the approximation error of that subsystem within tolerance. Then the simulators exchanged the solution trajectories and were asked to re-compute the same interval using the updated input trajectories, repeating this process until the exchanged trajectories converged.
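
A sketch of one waveform relaxation iteration over a single window is shown below; the `solve` method and the `distance` function used to measure convergence are hypothetical placeholders for whatever each simulator and tolerance check would provide.

```python
# Sketch of Jacobi-style waveform relaxation over one window [t, t+H].
# Each subsystem is assumed to expose a hypothetical method
#   solve(t, H, input_trajectories) -> output_trajectory
# and is free to pick its own internal step size to stay within tolerance.

def waveform_relaxation_window(subsystems, t, H, initial_guesses,
                               distance, tol=1e-6, max_iterations=50):
    trajectories = list(initial_guesses)
    for _ in range(max_iterations):
        # Every subsystem re-solves the whole window, using the other
        # subsystems' trajectories from the previous iteration as inputs.
        new_trajectories = [
            sub.solve(t, H, [traj for j, traj in enumerate(trajectories) if j != i])
            for i, sub in enumerate(subsystems)
        ]
        converged = max(distance(old, new)
                        for old, new in zip(trajectories, new_trajectories)) < tol
        trajectories = new_trajectories
        if converged:
            break
    return trajectories
```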

These techniques made possible the simulation of large scale circuits because they exploited parallel computers, and naturally supported subsystems with different dynamics: systems which changed slowly were more quickly driven to convergence, and with larger simulation step sizes. Additionally, these techniques were subject to extensive numerical analysis [47], highlighting their interesting theoretical properties.

In the late eighties, the release of the Time Warp Operating System represented the optimistic facet of parallel discrete event simulation. It acknowledged that the performance of a parallel discrete event simulation could be increased by allowing the different processes to simulate as fast as they could, and by correcting causality violations as they occurred. The corrections are made by rolling back the processes to a state that is consistent with the time of the event that caused the violation.

The performance of optimistic discrete event synchronization algorithms was such that it sparked research into large scale simulations with humans interacting in realistic environments created by collaborating simulators. Developed during the 80s, SIMNET was dedicated to military training exercises involving thousands of simulators representing, for instance, tanks or helicopters [48]. It encompassed an architecture and a protocol to implement the optimistic synchronization of simulators in a distributed environment, with real-time constraints. In order to keep a reasonable level of accuracy and realism, one of the innovations was the concept of dead-reckoning models. A dead-reckoning model is a computationally lightweight version of some other model, whose purpose is to be used by interested simulators when there is a failure of communication, or when the synchronization times are far apart.
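
A minimal sketch of the dead-reckoning idea is shown below: between updates, interested simulators extrapolate a remote entity's position from its last reported kinematic state. The state fields and the threshold policy mentioned in the comments are illustrative assumptions, not taken from SIMNET itself.

```python
from dataclasses import dataclass

# Illustrative dead-reckoning model: extrapolate a remote entity's position
# from its last reported kinematic state, to bridge gaps between updates.

@dataclass
class ReportedState:
    time: float
    position: tuple   # (x, y, z)
    velocity: tuple   # (vx, vy, vz)

def dead_reckon(last: ReportedState, t_now: float):
    dt = t_now - last.time
    return tuple(p + v * dt for p, v in zip(last.position, last.velocity))

# Typically, a new authoritative update is only broadcast when the true
# position deviates from this extrapolation by more than an agreed threshold.
```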

2.2 90s

In the early nineties, coordination languages emerged (e.g., Linda [6], Manifold [4]). These focused on the specification of the interaction between different parts of the system. According to [25], “Coordination is the process of building programs by gluing together active pieces”. A system designer defines one or more coordination model(s) to specify how the system models interact with each other.

During the same period, the software architecture research field proposed languages to abstract, structure, and reason about complex systems. One example is Architecture Description Languages (ADLs) [24]. An ADL description usually specifies a system in terms of components and interactions among those components. Such languages helped (1) to clarify structural and semantic differences between components and interactions, (2) to reuse and compose architectural elements, and (3) to identify/enforce commonly used patterns (e.g., architectural styles).

Coordination languages and ADLs have common objectives [52]. Both build/understand/analyse a system based on “components”, possibly written in different languages, and connectors (which include the specification of the interaction/coordination).

In 1990, United Airlines ordered 34 Boeing 777s, the first aircraft to be developed with concurrent engineering [37, 38]. The design was communicated fully in digital form, later aptly named a DMU (Digital Mockup Unit [3]), using CAD tools to showcase the different views of the system. This central repository of information served many purposes: (i) every team could consult the specifications of the subsystems made by any other team; (ii) simulations could be carried out periodically, to detect problems in the design; (iii) both the assembly and maintenance phases of the system could affect the design phase, by running simulations of repairs and assembly.

This milestone represented an increase in the information that is taken into account for the design of the product. The information no longer came only from requirements, but also from other stages of the life-cycle of the system: manufacturing, assembly, and maintenance. The milestone also highlights the many different purposes for which models of systems have to be available, and the new kinds of simulations these purposes require.

As digital circuits became more complex, they came to comprise microprocessors running software. This field spawned the need for hardware/software co-simulation [57], highlighting the heterogeneity facet. Before using co-simulation, software developers had to develop their code with little information about the underlying hardware, leading to painful integration efforts later on. Thanks to the coupling of circuit emulators with the software execution, developers were able to quickly identify miscommunication errors before building hardware prototypes.

In the field of physical system simulation, researchers realized that there should be a standardized way of representing physical system models, so that they could be easily coupled to form complex systems [50]. This was called the DSBlock (Dynamical System Block) standard [50]. This proposal later inspired a widely adopted standard for co-simulation: the Functional Mockup Interface standard. While the composition of DSBlocks still needed a solver, and is therefore not strictly considered co-simulation, it was a milestone in highlighting the need for standardization in continuous system co-simulation, which was also identified as a research priority [67]. Meanwhile, SIMNET evolved into the DIS (Distributed Interactive Simulation) standard [35], for discrete event based co-simulations.

As embedded systems were enhanced with communication capabilities, researchers noticed that the simulation of these distributed systems should not always be run at the same level of detail. Instead, the designers should be able to choose the level of detail they wanted for each embedded system: from the highest level of detail (circuit simulation) to the lowest (software simulation). This highlighted the facet of multi-abstraction co-simulation, and identified the main issues in coupling simulators that operate at different levels of abstraction.

2.3 2000s

The early 2000s were marked by multiple reported applications of co-simulation in industrial case studies [5, 43]. These had in common one facet: two simulators were coupled, each specialized in one domain, in a feedback loop. For example, in [5] the authors report on the study of the interaction between a pantograph (a mechanical structure on top of a train, connecting it to the electric grid) and a catenary (an overhanging cable that transmits electricity to the train). A flexible body simulator was used to compute the behavior of the catenary, and a multi-body simulator was used for the pantograph. In the meantime, the DIS standard, and its protocols, were generalized to non-real-time applications, in what became the HLA (High Level Architecture) standard [1].

In order to ensure the correctness of coordinated heterogeneous model simulations, the Ptolemy and Modhel'x projects proposed to expose some information about the behavioral semantics of languages (named a Model of Computation) [9, 20]. They then defined adaptations so that models governed by different Models of Computation could be co-simulated.

In 2008, the MODELISAR project published the FMI (Functional Mockup Interface) standard [7], whose essential contribution to co-simulation was the concept of Intellectual Property protection. It was an evolution of the DSBlock proposal, but recognized that each subsystem might need its own simulator. This standard is widely adopted in industry [58], where the simulation of externally supplied components can be costly due to high licensing costs.

Although there was some research on the coordination of black-box physical system simulators before the FMI standard was published (e.g., [5, 31, 41], and other references in [29]), the standard does not prescribe the synchronization protocol between simulators. The main reason is that, as in continuous system simulation, there is no one-size-fits-all simulation algorithm. This is in contrast to discrete event simulation, where the implementations of the DIS and HLA standards provide everything needed to run the co-simulation.

2.4 2010s

The current decade is marked by several applications of co-simulation across many domains (see, e.g., [29, 59]), the Digital Twin [26] concept, and an effort to systematically study co-simulation, with the publication of surveys [30, 32].

The Digital Twin extends the DMU concept not just to the design and assembly phases of the system, but also to its maintenance. The essential idea is to use high fidelity models of the system, calibrated with sensory information collected during its operation, to affect how the system should operate, predict failures, schedule maintenance, etc.

3 Applications

3.1 Exhaust Gas Recirculation (MAN Diesel and Turbo)

MAN Diesel & Turbo (MDT) is one of the largest producers of two-stroke combustion engines with distributed embedded control systems. Due to new emissions legislation on NOx, the systems that reduce the emission of this gas need to be improved. Since the development is split between different departments, using different tools, with limited sharing of models, co-simulation was applied to maximize the reuse of models [55].

The work in [55] describes an exhaust gas recirculation system, and a water handling system. The purpose is to clean and recirculate exhaust gas to a ship engine intake manifold. The exhaust gas is cleaned by spraying water into it, and allowing the mixture to cool down and flow into a receiving tank. Then, the (dirty) water is pumped to a water treatment center (externally developed) to be purified and reused.

The initial approach consisted of developing the control system in an in-house application framework that simulated both the control system and the physical models of the ship engine. While this traditional setup allows for simulation, the physical models are often implemented at a lower level of detail than, e.g., Matlab/Simulink® models. The co-simulation approach, based on the FMI standard, coupled the in-house application to MATLAB, so that higher fidelity physical models could be used. The authors believe that, had this approach been used from the start, a water tank overflow problem could have been discovered before running the software on an expensive engine test bench.

3.2 Driverless Lawn Mower (Agro Intelligence)

Another application of co-simulation is the development of a steering controller for an industrial size driverless lawn mower [22]. Besides aiding in the development of the control and navigation system of the lawn mower, co-simulation was applied to investigate alternative designs that would otherwise be both costly and time-consuming to test with physical prototypes.

The co-simulation scenario consisted of three parts: a simulator representing the vehicle dynamics, a simulator representing the control algorithm, and a simulator to convert values between the two. Additionally, each alternative design was projected in a 3D animation based on the game engine Unity, so that it could be visually inspected by designers and clients.

To make sure the co-simulation results were valid and accurate, an initial prototype was conceived and tested. Afterwards, multiple designs were evaluated with co-simulation, to find the optimal look-ahead distance and velocity. The simulation results for multiple look-ahead distances, and fixed velocity, are shown in Fig. 2.

Fig. 2. Simulated trajectories for look-ahead distance with velocity 1 m/s [54].

4 Emerging Trend and Challenges

4.1 Towards Full Virtualization

Throughout the history of co-simulation, a common trend emerges: a gradual shift towards the virtualization of not just the design of the system, but also assembly, operation, and maintenance.

The virtualization of the design of the system has been one of the primary uses of co-simulation, backed by concurrent engineering processes.

The virtualization of the assembly reflects an increased demand in the information that should be taken into account at the design phase, with concepts like the Digital Mockup Unit.

Complex systems that need interaction with human operators require training interfaces. Pioneered by military training simulators, the virtualization of operation refers to the creation of complex training environments at almost no cost, by leveraging the same co-simulation scenarios used in the design phase. As an example of this future, we highlight the design of a motion compensated crane [14], by ControlLab, where the crane operators are trained using a virtual reality environment (see Fig. 3).

Fig. 3. 3D real-time simulation of a motion compensated crane. Taken from [14].

Finally, extending the lifespan of systems, and reducing their downtime through the virtualization of their maintenance, is becoming a priority. This means that co-simulation can be combined with advanced sensors to create smart monitors (Digital Twins) that predict failures.

4.2 Challenges

The historical overview, and the main trend identified, highlight some of the challenges that researchers and industry will need to overcome in the upcoming years.

We divide these challenges into four categories: Design Space Exploration (DSE), X-in-the-Loop Co-simulation, Incremental Testing/Certification (IT/C), and Education.

Design Space Exploration consists of the systematic analysis and evaluation of different designs over a parameter space. When the evaluation of each design involves running a co-simulation, it becomes essential that co-simulations can be run quickly, accurately, and in a way that respects the physical constraints of the system.
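
A minimal sketch of such an exploration loop is shown below; `run_cosimulation` and `evaluate` are hypothetical placeholders for the activities described in this section.

```python
from itertools import product

# Sketch of a brute-force design space exploration loop: every combination of
# parameter values is evaluated by running a co-simulation and scoring its
# results. run_cosimulation and evaluate are hypothetical placeholders.

def explore(parameter_space, run_cosimulation, evaluate):
    """parameter_space: dict mapping parameter name -> list of candidate values."""
    names = list(parameter_space)
    best_design, best_score = None, float("-inf")
    for values in product(*(parameter_space[n] for n in names)):
        design = dict(zip(names, values))
        results = run_cosimulation(design)   # must be fast and trustworthy
        score = evaluate(results)
        if score > best_score:
            best_design, best_score = design, score
    return best_design, best_score
```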

Since the results of these simulations are typically not inspected by experts, it is crucial that these can be trusted. To this end, we highlight the need to ensure that each configuration of the system is valid, and the need for the co-simulation to preserve any properties that the configuration of the system satisfies.

Validity refers to whether the composition of subsystem models (induced by the co-simulation scenario) reflects a physically meaningful coupled system [17, 70]. This property is important because physical system models have many implicit assumptions, and their combination may violate those assumptions, voiding their predictive value. For example, in [60] the authors gave a questionnaire to several experts in various domains of physics, asking them to identify the implicit assumptions in a simple model of a particle moving in a viscous medium. No single expert was able to identify all 29 assumptions that were uncovered by their combined expertise.

The behavior of many engineered systems can be summarized by their evolution from one equilibrium to another [19], and it is important that their corresponding co-simulations reflect this property. While analyses have been developed that enable the automated verification of this property for continuous co-simulations (see [30, Sect. 4.3] and the references therein), there are many open challenges with the co-simulation of hybrid systems [27] and adaptive co-simulations [28].

X-in-The-Loop refers to co-simulations that are restricted in time and computing resources, due to the presence of human operators, animation requirements, or physical subsystems. In this context, there is a need for simulators which can provide contracts with timing guarantees on their computation time, based on the inputs and parameterization.

IT/C consists of the co-simulation activities that are applied as part of concurrent engineering activities, where the models of each subsystem are refined over time and integrated frequently. We highlight the need for co-simulations that provide formal guarantees on the accuracy of the behavior that is computed. Since the definition of a correct co-simulation is elusive and depends on the domain of application, each simulator should provide some form of contract. It should be possible to obtain an abstraction of each simulation unit that is appropriate to the kind of contracts defined. Existing research could be used as a starting point [8, 11, 36, 46].

Once each simulator provides formal guarantees, the orchestration algorithm should ensure that the composition of those contracts, and other formal properties, can be satisfied. As highlighted by works on heterogeneous simulations and, more recently, in [45], the way the different simulators are orchestrated can lead to incorrect results. This is especially true when discrete models (with frequent and natural discontinuities) are in the loop, since a minor change in timings can result in different behavior (consider, for instance, a double click versus two consecutive clicks).

To illustrate, consider a simulator that guarantees that there is no more than one discontinuity every 10 s. Then, depending on similar contracts satisfied by the other simulators, a similar kind of contract could be satisfied by the co-simulation as a whole.
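
As a toy illustration of how such contracts might compose, the sketch below makes the simplifying assumption, for illustration only, that discontinuities from different simulators simply accumulate over a shared window; it is not a result from the literature.

```python
# Toy illustration of composing discontinuity-rate contracts.
# Assumption (not a published result): discontinuities of different
# simulators simply accumulate, so the composed bound is their sum.

def composed_discontinuity_bound(contracts):
    """contracts: list of (max_discontinuities, window_seconds),
    all sharing the same window length."""
    windows = {w for _, w in contracts}
    assert len(windows) == 1, "contracts must share the same window length"
    return sum(k for k, _ in contracts), windows.pop()

# Example: three simulators, each with at most 1 discontinuity every 10 s,
# yield a composed contract of at most 3 discontinuities every 10 s.
print(composed_discontinuity_bound([(1, 10.0), (1, 10.0), (1, 10.0)]))
```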

Education refers to those challenges that are of non-technical nature, but are nonetheless crucial to attain the full virtualization vision.

In order for companies to adopt co-simulation, there are several concerns that hinder the theoretical possibilities from being employed in practical settings. One of these is the protection of intellectual property, which limits the information that is available for a given simulation unit. This is not an issue in itself, but it becomes one when considering other desirable properties of co-simulation, e.g., performance. For example, [61] describes two master algorithms: one that allows parallel computation but is limited in its applicability, and another that is less limited in applicability but requires a sequential execution. However, the information required to choose the optimal master algorithm in this case is not available. Similarly, [64] concerns precompiling a master algorithm optimised for a given scenario, but this also requires information that is not available in a black box implementation.

Another challenge is related to the current co-simulation standards. This is described in [10], which puts forth several requirements for hybrid co-simulation, such as superdense time, and relates them to the FMI standard. In general, the representation of time is a very important aspect of co-simulation, and [15] presents several extensions to FMI. One of the issues is that, in theory, several theorems rely on real numbers, which have infinite precision, whereas in practice time is often represented with numbers of finite precision.
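
The notion of superdense time put forth in [10] can be sketched as a pair of a real time instant and an integer index, so that a cascade of instantaneous events occurring at the same real time still has a well-defined order. The representation below is a simplified illustration, not the API of [10] or of the FMI standard.

```python
from typing import NamedTuple

# Simplified illustration of superdense time: events at the same real time t
# are ordered by an integer index n, so a cascade of instantaneous events
# (t, 0), (t, 1), (t, 2), ... has a well-defined order.

class SuperdenseTime(NamedTuple):
    t: float   # real part: the model time instant
    n: int     # index part: position within the cascade at time t

    def advance_time(self, dt: float) -> "SuperdenseTime":
        return SuperdenseTime(self.t + dt, 0)   # real time moves, index resets

    def next_micro_step(self) -> "SuperdenseTime":
        return SuperdenseTime(self.t, self.n + 1)

assert SuperdenseTime(1.0, 0) < SuperdenseTime(1.0, 1) < SuperdenseTime(2.0, 0)
```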

Finally, there is the challenge of proper integration with existing development processes. Co-simulations are initiated by different users with different backgrounds. This is not just about pushing a button and getting results: there is a need to integrate robust co-simulation frameworks into existing tools, such that each kind of user can use the most comfortable tool as a front end to run the co-simulations, and understands what he or she is doing. To this end, education and technology transfer are crucial steps.

5 Conclusion

Co-Simulation holds the promise to unleash the full potential of simulation. However, it is not a new concept. In this paper we present the historical events that resulted in what is today known as co-simulation. These highlight a trend towards the virtualization of every interaction with complex systems. Based on this trend, we identify several exciting challenges that lie ahead.