6.1 Introduction

In recent years, a new type of industry has been emerging that aims to be more adaptable, agile, and flexible. This industry, called “Industry 4.0”, promises to adapt to the personalized needs of customers through the integration and generalization of new Information and Communication Technologies (IoT, Big Data, RFID, Digital Twin, etc.) into the production system, such that new features can emerge:

  • Dynamic adaptation to high market volatility and to the need for tailor-made product solutions.

  • Communication with other systems and their environment.

  • Distributed intelligence: each component is able to sense and to decide.

However, some enablers are needed to support the realization of this new paradigm (Panetto et al. 2019). In particular, mass customization with ever shorter lead times makes it difficult to know the quantity and type of demand, the flow of products, and their fluctuations. The great diversity of manufactured products and their shortened life cycles thus increase the perturbations affecting the production model. Perturbations can be categorized as follows: (1) uncertainties, defined as the difference between predicted and actual information (uncertainties about the volume of demand, the duration of operations, etc.); (2) hazards, defined as the occurrence of uncontrollable events in production or in the environment (machine failure, urgent order, etc.).

To incorporate perturbations into the problem, different types of models exist in the literature. Ierapetritou and Jia (2007) list three common models for integrating perturbations into production models: delimited form or scenario description, probability description or stochastic models, and fuzzy modeling.

The main disadvantage of stochastic models has been the need for historical data to identify the right probability distribution and its parameters. However, Industry 4.0 and the integration of big data technologies promise access to data coming from the shop floor, so that these historical data and their analysis should help to build the right stochastic model of perturbations.

Such models can be integrated into classical Operations Research models for Robust Optimization (Bertsimas and Sim 2004). Operations Research models and the associated solution tools are particularly efficient at approaching the optimal solution despite the complexity of the problem. The counterpart of this efficiency, however, is often a dedicated static model whose cost of adaptation to a new characteristic can be very high. By nature, production systems in Industry 4.0 will be highly dynamic and reconfigurable. Discrete Event Systems (DES) models and tools are particularly efficient at capturing the dynamics of a system through the modeling of states and events (Cassandras and Lafortune 2009).

The objective of this chapter is to present a generic methodology to assess the impact of perturbations on production systems in order to define a solution with a good balance between performance and robustness. This methodology is a first milestone toward combining the advantages of robust optimization and Discrete Event Systems models and tools. The underlying idea is to iteratively call robust optimization and Discrete Event Systems models until the robustness level required by the decision-maker is reached. The methodology is applied to robust production scheduling with uncertain operation durations.

This chapter is organized as follows: the first section presents the generic hybrid approach combining Operations Research models for robust optimization with Discrete Event Systems models and tools for property verification. The second section presents the instantiation of this methodology to a scheduling problem under perturbations in a workshop with parallel machines. The third section illustrates and discusses the results on a use case. Finally, the last section concludes the chapter by recalling the obtained results and by opening a discussion on perturbations and Industry 4.0 in general.

6.2 A Hybrid Approach for Optimization Under Perturbations

In this section, we first introduce Linear Programming and robustness, then present Discrete Event Systems concepts. We finish by describing the proposed methodology, which combines a robust linear programming approach with Discrete Event Systems models to reach the robustness level that the decision-maker wants for a solution.

6.2.1 Linear Programming and Robustness

Linear programming is one of the most powerful tools in Operations Research. It can model a wide variety of practical problems (particularly in logistics) and can often solve them to optimality. Examples of such problems include scheduling, production planning, vehicle routing, and timetabling.

According to Papadimitriou and Steiglitz (1998), a Mixed Integer Linear Program (MILP) can be expressed as:

$$\displaystyle \begin{aligned} & \mathrm{Maximize}\quad {\sum_{j=1}^n c_jx_j}{} \end{aligned} $$
(6.1)
$$\displaystyle \begin{aligned} & \mathrm{s.t.} \\ & {\sum_{j=1}^{n} a_{ij}x_{j} \leq b_i}\quad {\forall i=1,\cdots,m}{} \end{aligned} $$
(6.2)
$$\displaystyle \begin{aligned} & {x_{j} \in \mathbb{N}}\quad {\forall j=1,\cdots,p}{} \end{aligned} $$
(6.3)
$$\displaystyle \begin{aligned} & {x_{j} \in \mathbb{R}_{+},}\quad {\forall j=p+1,\cdots,n}{} \end{aligned} $$
(6.4)

where

  • \((c_j)_{j=1,\dots,n}\), \((b_i)_{i=1,\dots,m}\), and \((a_{ij})_{(i,j)\in \left \{1,\dots ,m\right \}\times \left \{1,\dots ,n\right \}}\) are real values which represent the problem’s parameters (for instance, costs, distances, capacities, etc.).

  • \(X = (x_j)_{j=1,\dots,n}\) are the decision variables. They represent the solution we seek to determine.

  • Function (6.1) is a linear form which represents the criterion we seek to optimize (here, to maximize). For instance, it can represent logistics costs, customer satisfaction, etc.

  • Constraint (6.2) is a set of affine constraints that any solution of the modeled problem must satisfy (it describes the problem specificities).

  • Constraints (6.3) and (6.4) are integrality and non-negativity constraints.

For more information about linear programming, readers can also refer to Chvátal (1983), Wolsey (1998), and Nemhauser and Wolsey (1999).
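To make the formulation (6.1)–(6.4) concrete, the following minimal sketch solves a tiny pure-integer instance by exhaustive enumeration. The function name `solve_small_ip`, the instance data, and the explicit upper bounds on the variables are assumptions made for this illustration; a real MILP would be handed to a dedicated solver rather than enumerated.

```python
from itertools import product


def solve_small_ip(c, A, b, upper):
    """Maximize sum_j c[j]*x[j] subject to A x <= b, x integer, 0 <= x[j] <= upper[j].

    Brute-force enumeration: only meant for very small illustrative instances.
    """
    best_x, best_value = None, float("-inf")
    for x in product(*(range(u + 1) for u in upper)):
        # Constraint (6.2): every row i must satisfy sum_j a_ij * x_j <= b_i.
        feasible = all(
            sum(a_ij * x_j for a_ij, x_j in zip(row, x)) <= b_i
            for row, b_i in zip(A, b)
        )
        if not feasible:
            continue
        # Objective (6.1).
        value = sum(c_j * x_j for c_j, x_j in zip(c, x))
        if value > best_value:
            best_x, best_value = x, value
    return best_x, best_value


# Hypothetical instance with n = 2 integer variables and m = 2 constraints.
c = [5, 4]
A = [[6, 4], [1, 2]]
b = [24, 6]
best_x, best_value = solve_small_ip(c, A, b, upper=[4, 3])
print(best_x, best_value)
```

On this instance the continuous relaxation would select a fractional point, whereas the integrality constraint (6.3) forces an integer solution with a lower objective value.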

Usually, when using such modeling, all parameters are assumed to be well known and deterministic. Nevertheless, this situation is very rarely encountered in real life. Therefore, solutions determined by this method may be unrealistic in practice. To avoid this, one possibility is to introduce uncertainties on parameters in order to better model reality and to try to find solutions able to absorb these perturbations without unreasonably degrading their quality. This kind of approach is usually referred to as robust (Billaut et al. 2013).

In linear programming, several robust approaches have been designed, depending on which parameters are subject to uncertainty. Here, we focus on the case where the uncertainties affect the \(a_{ij}\) parameters. More precisely, we assume that each parameter \(a_{ij}\) takes its values in a bounded interval \(\left [\bar {a}_{ij}-\hat {a}_{ij}, \bar {a}_{ij}+\hat {a}_{ij}\right ]\). That is to say, there is a real random variable \(\zeta_{ij}\) which takes its values in \(\left [-1,1\right ]\) such that

$$\displaystyle \begin{aligned}a_{ij}= \bar{a}_{ij} + \zeta_{ij}\hat{a}_{ij}\end{aligned}$$

Thus, according to these assumptions, a MILP that takes into account these uncertainties can be formalized as follows:

$$\displaystyle \begin{aligned} & \mathrm{Maximize}\ {\sum_{j=1}^n c_jx_j}{} \end{aligned} $$
(6.5)
$$\displaystyle \begin{aligned} & \mathrm{s.t.} \\ & {\sum_{j=1}^{n} \bar{a}_{ij}x_{j} + \sum_{j=1}^{n} \zeta_{ij}\hat{a}_{ij}x_{j} \leq b_i }\quad {\forall i=1,\cdots,m}{} \end{aligned} $$
(6.6)
$$\displaystyle \begin{aligned} & {x_{j} \in \mathbb{N}}\quad {\forall j=1,\cdots,p}{} \end{aligned} $$
(6.7)
$$\displaystyle \begin{aligned} & {x_{j} \in \mathbb{R}_{+},}\quad {\forall j=p+1,\cdots,n}{} \end{aligned} $$
(6.8)

where \(\displaystyle \sum _{j=1}^{n} \zeta _{ij}\hat {a}_{ij}x_{j}\) models the uncertainty in constraint (6.6).

The main idea of robust approaches presented in this chapter is to try to reasonably protect oneself from this uncertainty by taking into account the risk, thanks to a set of deterministic functions \(\left (\beta _{i}^{\varOmega _i}\left (x\right )\right )_{i=1,\cdots ,m}\), where \(\left (\varOmega _i\right )_{i=1,\cdots ,m}\) are parameters tuned in order to meet the degrees \(\left (\varGamma ^{ref}_i\right )_{i=1,\cdots ,m}\) of protection the decision-maker wants to implement, depending on the criticality of the constraint.

In other words, \(\left (\varOmega _i\right )_{i=1,\cdots ,m}\) have to be set so that the probability that the uncertainty does not exceed \(\beta _{i}^{\varOmega _i}\left (x\right )\) is greater than or equal to \(\varGamma ^{ref}_i\), for i = 1, ⋯ , m:

$$\displaystyle \begin{aligned} \mathbb{P} \left[\sum_{j=1}^{n} \zeta_{ij}\hat{a}_{ij}x_{j} \leq \beta_{i}^{\varOmega_i}\left(x\right)\right] \geq \varGamma^{ref}_i,\quad \forall i=1,\cdots, m {} \end{aligned} $$
(6.9)

Thus, if \(\left (\varOmega _i\right )_{i=1,\cdots ,m}\) can be set such that Eq. (6.9) is satisfied, solving the following optimization problem yields a solution \(X=\left (x_j\right )_{j=1,\dots ,n}\) which withstands the uncertainty with the degrees of protection wanted by the decision-maker:

$$\displaystyle \begin{aligned} & \mathrm{Maximize}\ {\sum_{j=1}^n c_jx_j}{} \end{aligned} $$
(6.10)
$$\displaystyle \begin{aligned} & \mathrm{s.t.} \\ & {\sum_{j=1}^{n} \bar{a}_{ij}x_{j} +\beta_{i}^{\varOmega_i}\left(x\right) \leq b_i }\quad {\forall i=1,\cdots,m}{} \end{aligned} $$
(6.11)
$$\displaystyle \begin{aligned} & {x_{j} \in \mathbb{N}}\quad {\forall j=1,\cdots,p}{} \end{aligned} $$
(6.12)
$$\displaystyle \begin{aligned} & {x_{j} \in \mathbb{R}_{+},}\quad {\forall j=p+1,\cdots,n}{} \end{aligned} $$
(6.13)

In Bertsimas and Sim (2004), the authors propose to use the following set of functions:

$$\displaystyle \begin{aligned} \beta_{i}^{\varOmega_i}\left(x\right) =\max_{ \sum_{j=1}^{n} \left|\zeta_{ij} \right| \leq \varOmega_i } \left( \sum_{j=1}^{n} \zeta_{ij}\hat{a}_{ij}x_{j}\right) \end{aligned} $$
(6.14)

and they prove that this non-linear formulation can be linearized and is equivalent to a MILP. Thus, traditional Linear Programming techniques can be used for solving the initial problem. This kind of MILP is called a Robust Linear Programming Model.
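As a sketch of this linearization, assume \(x_j \geq 0\) as in (6.12) and (6.13), and \(\hat{a}_{ij} \geq 0\). The inner maximization in (6.14) can then be restricted to \(\zeta_{ij} \in [0,1]\) without loss, and linear programming duality replaces each constraint (6.11) by the linear constraints below. The auxiliary variables \(z_i\) and \(p_{ij}\) are dual variables introduced for this illustration; they are not part of the notation of this chapter.

$$\displaystyle \begin{aligned} & \sum_{j=1}^{n} \bar{a}_{ij}x_{j} + \varOmega_i z_i + \sum_{j=1}^{n} p_{ij} \leq b_i \quad \forall i=1,\cdots,m \\ & z_i + p_{ij} \geq \hat{a}_{ij}x_{j} \quad \forall i=1,\cdots,m,\ \forall j=1,\cdots,n \\ & z_i \geq 0,\ p_{ij} \geq 0 \quad \forall i=1,\cdots,m,\ \forall j=1,\cdots,n \end{aligned} $$

For a fixed x, the smallest attainable value of \(\varOmega_i z_i + \sum_{j} p_{ij}\) equals \(\beta_{i}^{\varOmega_i}\left(x\right)\) by strong duality, so both formulations admit the same feasible solutions x.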

Nevertheless, tuning \(\varOmega =\left (\varOmega _i\right )_{i=1,\cdots ,m}\) for satisfying (6.9) can be very difficult. In Bertsimas and Sim (2004), the authors show that if, for all i, the \(\zeta_{ij}\) are independent and symmetrically distributed in \(\left [-1,1\right ]\), \(\varOmega_i\) can be determined analytically. However, such a hypothesis rarely holds in real industrial problems.
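For a given solution x, the protection function (6.14) can be evaluated directly: the maximum is reached by putting as much of the budget \(\varOmega_i\) as possible on the largest terms \(\hat{a}_{ij}\left|x_j\right|\). The sketch below illustrates this greedy evaluation for one constraint; the function name `protection` and the sample data are assumptions made for the illustration.

```python
import math


def protection(a_hat, x, omega):
    """Evaluate beta^Omega(x) of Eq. (6.14) for a single constraint.

    Maximizes sum_j zeta_j * a_hat[j] * x[j] subject to
    sum_j |zeta_j| <= omega and |zeta_j| <= 1 for every j.
    """
    # Largest possible contribution of each term, sorted in decreasing order.
    weights = sorted((abs(a * v) for a, v in zip(a_hat, x)), reverse=True)
    # A budget above the number of terms cannot be used; a negative one is treated as 0.
    budget = min(max(omega, 0.0), len(weights))
    full = math.floor(budget)
    # The `full` largest terms deviate completely ...
    total = sum(weights[:full])
    # ... and the next one deviates by the fractional remainder of the budget.
    if full < len(weights):
        total += (budget - full) * weights[full]
    return total


# Hypothetical data: three uncertain coefficients, all variables equal to 1.
a_hat = [2.0, 1.0, 3.0]
x = [1, 1, 1]
for omega in (0, 1.5, 3):
    print(omega, protection(a_hat, x, omega))
```

With a zero budget the protection vanishes and the nominal constraint is recovered; with a budget equal to the number of uncertain coefficients, the protection is the sum of all deviations, which corresponds to the worst case.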

6.2.2 Discrete Event Systems Models for Evaluating Solution Robustness

Industrial systems can be modeled as Discrete Event Systems (DES), which represent the behavior of a system through its states and the events that make it evolve. An event is seen as an instantaneous occurrence of an action or phenomenon in the system environment. Changes due to an event can be deterministic, when the resulting behavior is known with certainty, or stochastic, when the occurrence of an event can lead to different states. Modeling tools for DES include Petri Nets, State Automata, Statecharts, and Bayesian Networks (Cassandras and Lafortune 2009).

To model the behavior of industrial systems and perturbations, we should be able to represent many dynamic characteristics such as the communication between elements of the workshop (jobs, resources), the time, and the probabilistic behavior of perturbations. Many stochastic Discrete Event Systems languages allow the modeling of these characteristics, for instance Stochastic Petri Nets (Chiola et al. 1993), Stochastic Automata (Alur and Dill 1994), and Stochastic Automata Networks (Plateau and Atif 1991).

The language chosen here is Stochastic Timed Automata (STA). It is an extension of the well-known Timed Automata (Alur and Dill 1994), enriched with shared variables, synchronizing events, and probabilistic characteristics (Larsen et al. 1997).

Definition 6.1

Formally, a Stochastic Timed Automaton is defined as the following tuple:

\(A = (L, V, E, C, Inv, Pr, T, L_m, l_0, v_0)\) where

  • L is a finite set of locations.

  • V  is a finite set of variables.

  • E is a finite set of synchronizing events.

  • C is a finite set of clocks.

  • Inv is a set of invariants (conditions in location).

  • Pr is a set of probabilities: (i) discrete for the set of transitions (from a location, probabilistic transitions lead to different locations \(l_i\) with a given probability \(p_i\), with \(\sum p_{i}=1\)); (ii) continuous for the variables (the crossing condition of a transition is defined randomly by a probability distribution).

  • T is a finite set of transitions (l, e, g, m, l′) ∈ L × E × G × M × L, where l and l′ are, respectively, the starting and arriving locations. On a transition, three optional elements are defined: (i) a guard (condition on variables) g from the set of guards G, (ii) an update (on variables) m from the set of updates M, and (iii) a synchronizing event e from the set E.

  • \(L_m \subseteq L\) is the set of marked locations.

  • \(l_0 \in L\) is the initial location of the automaton.

  • \(v_0\) is the initialization vector of variables.

The elements of a STA can be graphically represented as follows (see the example in Fig. 6.1). Locations are represented by vertices and transitions by arcs. The initial active location is represented by a double vertex. The invariants are represented inside the associated vertex (location). Guards are represented between brackets “[ ]”. Synchronizing events are represented in italics. Updates of variables and clocks are represented between parentheses “( )”. Discrete probabilities are modeled by dotted arcs, and the associated probability values are underlined. Continuous probabilities are directly linked to the definition of variables.

Fig. 6.1 A Stochastic Timed Automaton representing a machine

The automaton MACHINE in Fig. 6.1 represents the behavior of a machine that can be subjected to failures. For this purpose, the machine can be represented by three states: Idle, Busy, Failed. Moreover, the failure rate is represented by λ and the repair rate by μ.

Initially, the MACHINE is in the location Idle, waiting for the CycleStart event. After the occurrence of this event, the local clock c is initialized (c := 0) and the MACHINE becomes Busy, i.e. it is executing a cycle (which normally lasts T time units). Before T time units have elapsed, a failure may occur with probability λ (the MACHINE reaches the location Failed), or the machine continues its cycle with probability 1 − λ. In the location Failed, the MACHINE is repaired with probability μ and, in this case, the cycle restarts from zero (c := 0). In this example, a failure has a large impact because the cycle restarts from zero after the repair.

Discrete Event Systems models can be used either to control, i.e. to inhibit certain state transitions to avoid unwanted behaviors, or to evaluate performance, i.e. to check properties such as the reachability of a state or the execution of a sequence of events. This “Model-Checking” ability can be used for evaluating the impact of a perturbation on a system.

Properties that are traditionally desirable for an industrial system concern reliability, maintainability, and safety, and DES models and tools are commonly used for evaluating these properties. For instance, in Morel et al. (2009), reliability is defined as the ability of a device or system to perform a required function under stated conditions for a specified period of time. This property is often measured by the probability R(t) that a system operates without failure up to time t (depending on the failure rate λ), i.e. the probability that the Time To Failure TTF is greater than the time t:

$$\displaystyle \begin{aligned} R(t) = \mathbb{P}(TTF>t) \end{aligned} $$
(6.15)

Now, we show how DES models and tools can be used for evaluating reliability. In the example in Fig. 6.1, reliability can be the probability that the Failed location is never reached before the cycle time T (meaning TTF > T, so that no failure happens during the cycle; otherwise the cycle has to restart from zero). For stochastic DES models, such a property can be expressed in PCTL (Probabilistic Computation Tree Logic), a probabilistic extension of CTL (Computation Tree Logic) (Baier and Kwiatkowska 1998). This type of logic can express properties like “What is the probability that the model reaches the state Failed within the time interval [0,T]?” This question can be transcribed in PCTL as in expression (6.16).

$$\displaystyle \begin{aligned} P=?[F \leq T \ ``MACHINE.Failed''] {} \end{aligned} $$
(6.16)

where P =? means that we want to assess the probability that the property that is inside the brackets [ ] is reached. This property can be translated as follows:

  • F ≤ T means “There exists in the future in a time that is less or equal to T.”

  • “MACHINE.Failed” means “a state where the stochastic timed automaton MACHINE is in the marked location Failed.”

Finally, the obtained result assesses 1 − R(T). Model-Checking can thus be used for evaluating the reliability of a system, and the same can be done for maintainability and safety.

Property verification is implemented by model-checking. The input of the model-checker is a system model and a property. As output, the model-checker indicates whether the property is satisfied and, if not, returns a counter-example (i.e., an example that shows that the property is violated). In the case of stochastic system modeling, model-checking can be done numerically or statistically:

  • Numerical model-checking uses exact evaluation methods to determine the probability value of a property. This type of model-checking guarantees the accuracy of the given solution, but it is not suitable for large problems.

  • Statistical model-checking generates different execution paths and verifies, after each execution, the satisfaction of a property. It is similar to Monte Carlo simulation, a method for estimating a numerical quantity using random numbers. At each simulation step, the expectation of the variable is calculated, and the simulation stops when the statistical parameters are satisfied. This avoids the combinatorial explosion and is therefore suited to checking real systems (Ballarini et al. 2011).
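The statistical approach can be illustrated on property (6.16) with a minimal Monte Carlo sketch. It assumes, for the sake of the example only, that the Time To Failure follows an exponential distribution with rate λ, so that the exact value 1 − exp(−λT) is available for comparison; this is a simplification and not the semantics of the automaton of Fig. 6.1. The sketch also uses a fixed number of runs instead of a statistical stopping criterion, and the function name and numerical values are hypothetical.

```python
import math
import random


def estimate_failure_probability(failure_rate, horizon, runs, seed=0):
    """Estimate P(TTF <= horizon) by sampling independent execution paths.

    Assumption: TTF is exponentially distributed with the given rate.
    """
    rng = random.Random(seed)  # fixed seed so that the experiment is reproducible
    failures = sum(
        1 for _ in range(runs) if rng.expovariate(failure_rate) <= horizon
    )
    return failures / runs


lam, T = 0.1, 5.0  # hypothetical failure rate and cycle time
estimate = estimate_failure_probability(lam, T, runs=20000)
exact = 1.0 - math.exp(-lam * T)  # equals 1 - R(T) under the exponential assumption
print(f"estimated: {estimate:.4f}  exact: {exact:.4f}")
```

The estimate approaches the exact value as the number of runs grows, which is why statistical model-checkers expose the confidence level and precision as parameters rather than a fixed run count.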

Actually, reliability can be seen as a robustness property. The notion of robustness has different definitions in literature that converge to the same idea: a robust system should maintain or guarantee some performances despite perturbations and variations generated by the system or its environment (Billaut et al. 2013).

When considering perturbations modeled as stochastic variables, the concept of “service level” can be used for assessing robustness (Dauzères-Pérès et al. 2010). A service level \(\mathcal {S}\mathcal {L}\) is defined as the probability that a criterion is smaller (resp. larger) than or equal to a given value. Thus, the assessed robustness level \(\mathcal {S}\mathcal {L}\) can be translated as the probability \(\mathbb {P}\) that a system state z is lower than a value \(z_{max}\) (or larger than a value \(z_{min}\)) as in the following equation:

$$\displaystyle \begin{aligned} \mathcal{S}\mathcal{L} = \mathbb{P}(z\leq z_{max}) {} \end{aligned} $$
(6.17)

We can see that reliability falls within this definition. DES models and tools are thus good candidates for evaluating the robustness level of a system.

6.2.3 Proposed Methodology for Combining the Two Approaches

As said before, fine-tuning the \(\varOmega =\left (\varOmega _i\right )_{i=1,\cdots ,m}\) for satisfying (6.9) can be very difficult in general cases. Therefore, we propose a methodology for iteratively and numerically tuning Ω, thanks to Discrete Event Systems Models and associated Model-Checking tools. Figure 6.2 sketches the proposed approach. Here, we suppose that:

  • the problem we want to solve, which relates to the system-of-interest, can be formalized by a Mixed Integer Linear Programming model,

  • the system-of-interest and the dynamics we want to study can be modeled by a Discrete Event Systems (DES) model,

  • the decision-maker is able to define the different robustness indicators \(\left (\varGamma ^{ref}_{i}\right )_{i=1,\dots ,m}\) for each constraint i that must be satisfied by the solution X.

Fig. 6.2 The proposed methodology

To determine the parameters \(\left ( \varOmega _i \right )_{i=1,\cdots ,m}\) that lead to the robustness level wanted by decision-makers, we propose a methodology based on the following three modules, which are called iteratively:

  • Module 1: The Operations Research Module (OR Module). According to the current value of \(\left ( \varOmega _i \right )_{i=1,\cdots ,m}\), the Robust Mixed Integer Programming Model using Bertsimas and Sim’s framework is designed (Bertsimas and Sim 2004). This model is then input into a solver to get the optimal solution taking into account the robustness parameters. The obtained solution X is sent to the second module.

  • Module 2: The Discrete Event Systems Module (DES Module). Considering the system-of-interest and the solution proposed by the Operations Research Module, a Stochastic Timed Automata model is first designed. Then, the different robustness levels \(\varGamma_i\) (as instantiations of the service level \(\mathcal {S}\mathcal {L}\)) to be assessed are defined as properties to be checked on the resulting model by a model-checker. The resulting \(\left (\varGamma _{i}\right )_{i=1,\dots ,m}\) are sent to the third module.

  • Module 3: The Update Module. If the robustness levels \(\left (\varGamma _{i}\right )_{i=1,\dots ,m}\) assessed by the Discrete Event Systems Module reach the levels required by the decision-maker, the process stops and the solution is returned. Otherwise, \(\left ( \varOmega _i \right )_{i=1,\cdots ,m}\) are updated and sent back to the Operations Research Module for a new iteration.
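The interplay of the three modules can be sketched as a simple loop. In the code below, `solve_robust`, `assess_robustness`, and `update_omega` are hypothetical callables standing for the OR Module, the DES Module, and the update rule; the toy stand-ins and the iteration cap are assumptions made only so that the skeleton can be executed, and they do not reproduce the actual modules of this chapter.

```python
def tune_omega(solve_robust, assess_robustness, update_omega,
               omega, gamma_ref, max_iterations=50):
    """Iterate the three modules until every required robustness level is reached."""
    for iteration in range(1, max_iterations + 1):
        x = solve_robust(omega)            # Module 1: robust MILP for the current omega
        gamma = assess_robustness(x)       # Module 2: robustness levels from the DES model
        # Module 3: stop if every assessed level meets its reference level ...
        if all(g >= g_ref for g, g_ref in zip(gamma, gamma_ref)):
            return x, omega, gamma, iteration
        # ... otherwise update omega and start a new iteration.
        omega = update_omega(omega, gamma, gamma_ref)
    raise RuntimeError("required robustness levels not reached")


# Toy stand-ins (hypothetical): the "solution" is omega itself, and each
# robustness level grows linearly with the corresponding budget.
def toy_solve(omega):
    return tuple(omega)


def toy_assess(x):
    return [min(1.0, 0.5 + 0.125 * w) for w in x]


def toy_update(omega, gamma, gamma_ref):
    # Increase the budget only for the constraints that are not robust enough.
    return [w + 1 if g < g_ref else w
            for w, g, g_ref in zip(omega, gamma, gamma_ref)]


x, omega, gamma, iterations = tune_omega(
    toy_solve, toy_assess, toy_update, omega=[0, 0], gamma_ref=[0.75, 0.625]
)
print(x, omega, gamma, iterations)
```

Passing the modules as functions keeps the loop independent of the solver and of the model-checker, which mirrors the generic nature of the methodology.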

6.3 Application to the Problem of Scheduling Under Perturbations

6.3.1 Scheduling Under Perturbations

The issue of production scheduling is an important decision-making problem in industrial processes. To guarantee production performance, the decision-maker has to find a schedule adapted to the production system and its associated constraints. A production scheduling problem usually consists in (1) allocating the workshop resources to the operations needed to complete the jobs, (2) sequencing the operations on resources (defining the execution order of operations on resources), and (3) possibly defining the starting and ending dates of each operation. The schedule obtained should satisfy the workshop constraints (precedence constraints, non-preemption of operations, etc.). Indeed, each type of workshop has its own constraints in order to satisfy the production objective (like minimizing the total completion time of operations, number of late jobs, production cost, etc.).

Monostori et al. (2016) consider robust scheduling as one of the six main challenges in Research and Development for Cyber-Physical Production Systems. Others, like Zhong et al. (2017), prefer to talk about the need for intelligent scheduling able to generate, from captured data, a reliable schedule in real time.

6.3.2 Instantiation of the Approach to Scheduling Under Perturbations

Here, we illustrate our methodology on a scheduling problem in a production area composed of two non-identical parallel machines: the machines can perform the same operations but with different processing times. The scheduling problem in this production cell thus involves both machine allocation and sequencing, rather than simply sequencing (Mokotoff 2001). Figure 6.3 shows the considered production cell.

Fig. 6.3 The production area

In our case, we seek to minimize the completion time of the last scheduled job: this criterion is usually called the Makespan and is denoted as \(C_{max}\). The main assumptions of our problem are the following:

  • All jobs are available at time 0,

  • The two machines are always available (no breakdown, …),

  • Processing times for the jobs are independent,

  • A machine cannot process more than one product at any time.

This problem, referred to as \(R2||C_{max}\), has been shown to be NP-hard in the weak sense (Lenstra et al. 1977). Here, we also assume that processing times are not deterministic.

6.3.2.1 Instantiation of the Methodology

The methodology presented in Fig. 6.2 can be instantiated as in Fig. 6.4. The MILP formulation of the problem \(R2||C_{max}\) under uncertainties is presented in Sect. 6.3.2.2. The DES Module is presented in Sect. 6.3.2.3. The Update Module is presented in Sect. 6.3.2.4.

Fig. 6.4 Instantiated methodology to \(R2||C_{max}\)

6.3.2.2 Operations Research Module

First, we give the MILP formulation of this scheduling problem when all the processing times are deterministic.

The parameters of the model are given in Table 6.1.

Table 6.1 Parameters of the model

The decision variables are summarized in Table 6.2.

Table 6.2 Decision variables of the model

The \(R2||C_{max}\) problem can be formulated as follows:

$$\displaystyle \begin{aligned} & \mathrm{Minimize}\ {C_{max}}{} \end{aligned} $$
(6.18)
$$\displaystyle \begin{aligned} & \mathrm{s.t.} \\ & {\sum_{k=1}^{2} x_{jk} = 1}\quad {\forall j \in \left\{1,\ldots,N\right\}}{} \end{aligned} $$
(6.19)
$$\displaystyle \begin{aligned} & {C_{max}-\sum_{j=1}^{N} t_{jk}x_{jk} \geq 0}\quad {\forall k \in \left\{1,2\right\}}{} \end{aligned} $$
(6.20)
$$\displaystyle \begin{aligned} & {C_{max}\geq 0}{}{} \end{aligned} $$
(6.21)
$$\displaystyle \begin{aligned} & {x_{jk}\in\left\{0,1\right\}}\quad {\forall \left(j,k\right) \in \left\{1,\ldots,N\right\}\times\left\{1,2\right\}}{} \end{aligned} $$
(6.22)

Equation (6.18) is the objective function we seek to minimize. Equation (6.19) ensures that every job is executed by exactly one machine. Constraint (6.20) requires that the total completion time \(C_{max}\) is at least the completion time on each machine. Constraints (6.21) and (6.22) are non-negativity and integrality constraints.

Now, we suppose that there are some uncertainties related to the jobs’ processing times. As presented in the robustness section, there is a random variable \(\zeta_{jk}\) which takes its values in \(\left [-1, 1\right ]\) such that
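For a handful of jobs, the deterministic model (6.18)–(6.22) can be checked by enumerating all assignments. The following sketch does so; the function name `solve_r2_cmax`, the 0-based machine indices, and the processing times are assumptions made for this illustration, and enumeration is of course not a substitute for a MILP solver on realistic instances.

```python
from itertools import product


def solve_r2_cmax(t):
    """Minimize the makespan on two unrelated parallel machines by enumeration.

    t[j][k] is the processing time of job j on machine k, with k in {0, 1}.
    """
    best_assignment, best_cmax = None, float("inf")
    # Each tuple gives, for every job j, the machine it is assigned to, so
    # constraint (6.19) (exactly one machine per job) holds by construction.
    for assignment in product((0, 1), repeat=len(t)):
        loads = [0, 0]
        for j, k in enumerate(assignment):
            loads[k] += t[j][k]
        # Constraint (6.20): C_max is at least the load of each machine.
        cmax = max(loads)
        if cmax < best_cmax:
            best_assignment, best_cmax = assignment, cmax
    return best_assignment, best_cmax


# Hypothetical instance with N = 4 jobs.
t = [[3, 5], [4, 2], [6, 7], [2, 3]]
assignment, cmax = solve_r2_cmax(t)
print(assignment, cmax)
```

The enumeration visits 2^N assignments, which is only workable for very small N and illustrates why a dedicated solver is used in practice.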

$$\displaystyle \begin{aligned}t_{jk}= \bar{t}_{jk} + \zeta_{jk}\hat{t}_{jk}\end{aligned}$$

According to the Bertsimas and Sim (2004) approach, we can formulate the robust model as follows:

$$\displaystyle \begin{aligned} & \mathrm{Minimize}\ {C_{max}}{} \end{aligned} $$
(6.23)
$$\displaystyle \begin{aligned} & \mathrm{s.t.} \\ & {\sum_{k=1}^{2} x_{jk} = 1}\quad {\forall j \in \left\{1,\ldots,N\right\}}{} \end{aligned} $$
(6.24)
$$\displaystyle \begin{aligned} & {C_{max}-\sum_{j=1}^{N} \bar{t}_{jk}x_{jk}-\max_{ \sum_{j=1}^{N} \left|\zeta_{jk} \right| \leq \varOmega_k } \left( \sum_{j=1}^{N} \zeta_{jk}\hat{t}_{jk}x_{jk}\right)\geq 0}\quad {\forall k \in \left\{1,2\right\}}{} \end{aligned} $$
(6.25)
$$\displaystyle \begin{aligned} & {C_{max}\geq 0}{}{} \end{aligned} $$
(6.26)
$$\displaystyle \begin{aligned} & {x_{jk}\in\left\{0,1\right\}}\quad {\forall \left(j,k\right) \in \left\{1,\ldots,N\right\}\times\left\{1,2\right\}}{} \end{aligned} $$
(6.27)

In this context, \(\varOmega_k\) represents the maximal deviation (using the \(\mathcal {L}^1\)-norm) that is taken into account in the model for each machine k. If \(\varOmega_k = 0\), no uncertainties are taken into account: the constraints (6.25) become equivalent to the constraints (6.20) and the robust formulation becomes equivalent to the deterministic formulation. On the contrary, if we want to consider all the uncertainties, \(\varOmega_k\) must be chosen equal to N. In this case, the most conservative solution is obtained, since the constraints (6.25) become equivalent to the following:

$$\displaystyle \begin{aligned} C_{max}-\sum_{j=1}^{N} \bar{t}_{jk}x_{jk}- \sum_{j=1}^{N} \hat{t}_{jk}x_{jk}\geq 0,\ \forall k \in \left\{1,2\right\} \end{aligned} $$
(6.28)

Thus, this corresponds to the worst-case formulation, i.e. the most conservative one, which gives priority to the worst case (all the \(\zeta_{jk}\) equal to 1) over all other cases.

Here, the idea is to fix \(\varOmega_k\) (as a “maximum amount of deviation” on the operation durations) while guaranteeing that the desired robustness levels \(\varGamma ^{ref}_k\) are reached.
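The effect of \(\varOmega_k\) on constraint (6.25) can be illustrated by evaluating the smallest \(C_{max}\) that satisfies it for a fixed assignment. The sketch below is a minimal illustration under assumptions: the function names, the 0-based machine indices, and all numerical data are hypothetical, and the assignment is given rather than optimized.

```python
import math


def protection(deviations, omega):
    """Inner maximum of (6.25): spend the budget omega on the largest deviations."""
    weights = sorted(deviations, reverse=True)
    budget = min(max(omega, 0.0), len(weights))
    full = math.floor(budget)
    total = sum(weights[:full])
    if full < len(weights):
        total += (budget - full) * weights[full]
    return total


def robust_cmax(assignment, t_bar, t_hat, omega):
    """Smallest C_max satisfying (6.25) on both machines for a fixed assignment.

    assignment[j] is the machine (0 or 1) of job j; omega[k] is the budget of machine k.
    """
    cmax = 0.0
    for k in (0, 1):
        jobs = [j for j, k_j in enumerate(assignment) if k_j == k]
        nominal = sum(t_bar[j][k] for j in jobs)
        margin = protection([t_hat[j][k] for j in jobs], omega[k])
        cmax = max(cmax, nominal + margin)
    return cmax


# Hypothetical nominal durations and maximal deviations for N = 4 jobs.
t_bar = [[3, 5], [4, 2], [6, 7], [2, 3]]
t_hat = [[1, 1], [1, 0.5], [2, 1], [0.5, 1]]
assignment = (1, 1, 0, 0)
for omega in ((0, 0), (1, 1), (1.5, 1.5), (4, 4)):
    print(omega, robust_cmax(assignment, t_bar, t_hat, omega))
```

With a zero budget the value coincides with the deterministic makespan of the assignment, and with a budget equal to N it coincides with the worst-case expression (6.28); intermediate budgets give intermediate margins.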

6.3.2.3 Discrete Event Systems Module

This section presents the DES formulation used to evaluate the robustness level of the allocation \(X=\left [x_{jk}\right ]_{jk}\) obtained by solving the robust MILP formulation given in the previous section. The result is sent to the Update Module, which either updates \(\left (\varOmega _k\right )_{k=1,2}\) accordingly and launches a new iteration, or stops. The DES module contains two steps:

  • Step 1: Stochastic Timed Automata Model design: STA are used to model the behavior of jobs and machines when executing the allocation X.

  • Step 2: Robustness evaluation by Model-Checking: the robustness levels \(\varGamma_k\) associated with each machine k are evaluated.

6.3.2.3.1 Stochastic Timed Automata Model Design

First we propose to model the behavior of the jobs and the machines when they are not subjected to uncertainties. We define a job pattern that will be instantiated for each job j and a machine pattern that will be instantiated for each machine k.

In the following, we denote as \(k_j\) the machine that is allocated to the job j (defined by X coming from the Operations Research Module). Formally, \(k_j = \sum_{k} k \cdot x_{jk}\).

The job pattern (named \(\alpha_j\)) is presented in Fig. 6.5a. First, in the Waiting to be executed location, the job j waits for the availability of its allocated machine \(k_j\) (through the guard \(\left [Avail\left (k_j\right )==True\right ]\)). When the guard is satisfied, the job pattern sends a request to the machine pattern (by the synchronizing event \(Request\left (k_j\right )\)) and reaches the In execution location, waiting for its completion (the reception of the synchronizing event \(Completed\left (j\right )\)). After its completion, the job reaches the Completed location.

Fig. 6.5 STA models of job and machine without considering perturbations. (a) Job STA: \(\alpha_j\). (b) Deterministic machine STA

The machine pattern is represented in Fig. 6.5b. In this model, the job that is going to be executed is denoted as \(j_k\). First, in the Idle location, the machine waits for a request from a job (the synchronizing event \(Request\left (k\right )\)) and then reaches the Busy location after updating its availability status (Avail(k) := False) and initializing the local clock \(t_{{j_k}k}\) to 0. In the Busy location, the machine executes the job until the local clock reaches the deterministic duration \(\bar {t}_{{j_k}k}\). When the duration is reached, the job can be informed of its completion (by the synchronizing event \(Completed\left (j_k\right )\)) and the machine updates its availability status (Avail(k) := True). The machine then goes back to the Idle location.

Now, we integrate the uncertainties on the job duration into the machine pattern. The resulting updated machine pattern is represented in Fig. 6.6. In the following, we denote the upper value of the job duration as \(t_{{j_k}k}^{max}=\bar {t}_{{j_k}k} + \hat {t}_{{j_k}k}\) and the lower value of the job duration as \(t_{{j_k}k}^{min}=\bar {t}_{{j_k}k} - \hat {t}_{{j_k}k}\).

Fig. 6.6 Perturbed machine STA

In the Busy 1 location, the machine waits to reach the minimal duration \(t_{{j_k}k}^{min}\) (through the guard \(\left [t_{{j_k}k} == t_{{j_k}k}^{min}\right ]\)). Moreover, the iteration counter l is initialized to 0. The idea is to let the duration increase according to a discrete probability p(l) that evolves with the iteration number l. In the Busy 2 location, if the maximal duration is reached (\(\left [t_{{j_k}k} == t_{{j_k}k}^{max}\right ]\)), then the machine reaches the completing location. Otherwise (\(\left [t_{{j_k}k} < t_{{j_k}k}^{max}\right ]\)), there are two possible probabilistic choices: (1) with the probability 1 − p(l), the duration increases and the iteration counter is updated (l := l + 1) or (2) with the probability p(l), the current duration is the final duration.

Finally, the probability that \(t_{{j_k}k} =t_{{j_k}k}^{min} + l\) is the probability of looping in the Busy 2 location l − 1 times and of leaving the loop at the l th iteration.

Actually, p(l) is a probabilistic parameter that can be calculated from the probability distribution followed by \(t_{{j_k}k}\).
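One possible way to obtain p(l), given here as a sketch under stated assumptions rather than as the procedure actually used, is to take the discrete hazard rate of the duration distribution: p(l) is the probability that the duration equals \(t_{{j_k}k}^{min} + l\) knowing that it is at least that value. We assume unit time steps and let l denote the value of the counter when the loop is left, starting from 0 as in the Busy 1 location (other iteration-numbering conventions shift the index by one). The function names and the example distribution are hypothetical.

```python
from fractions import Fraction


def exit_probabilities(pmf):
    """Discrete hazard rates p(l) = P(T = t_min + l | T >= t_min + l).

    pmf[l] is the probability that the duration equals t_min + l
    (unit time steps, l = 0 .. t_max - t_min); the values must sum to 1.
    """
    remaining = Fraction(1)  # P(T >= t_min + l)
    p = []
    for mass in pmf:
        # Once no probability mass remains, exiting is forced.
        p.append(Fraction(mass) / remaining if remaining > 0 else Fraction(1))
        remaining -= Fraction(mass)
    return p


def duration_distribution(p):
    """Replay the Busy 2 loop: exit with p(l), otherwise increment l."""
    still_looping = Fraction(1)  # probability of reaching counter value l
    pmf = []
    for p_l in p:
        pmf.append(still_looping * p_l)
        still_looping *= 1 - p_l
    return pmf


# Hypothetical asymmetric distribution on {t_min, t_min + 1, t_min + 2}.
pmf = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
p = exit_probabilities(pmf)
print([str(q) for q in p])  # ['1/2', '3/5', '1']
```

Replaying the loop with these values gives back the original distribution, and the last value p(l) = 1 is consistent with the guard that forces completion once the maximal duration is reached.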

Modeling the execution of the job in this way does not restrict the approach to any particular kind of probability distribution (symmetric or not, discrete or not, etc.). We could even split the interval of the job duration into several sub-intervals with different probability distributions. This makes the approach a good complement to the robust linear programming of the Operations Research module.

6.3.2.3.2 Robustness Evaluation by Model-Checking

In the second step, model-checking tools are used to assess the robustness level of X.

In a scheduling problem, we can instantiate the service level presented in Eq. (6.17) as follows: z is the total completion time under the considered uncertainties and z max is the reference completion time C max associated with X given by the Operations Research module. We thus define the robustness level as the probability that the executed makespan is smaller than or equal to the reference completion time C max. Formally, this metric is given by Eq. (6.29):

$$\displaystyle \begin{aligned} \mathcal{S}\mathcal{L}=\mathbb{P}\left(C_{max}\left(X,U\right) \leq C_{max}\right) {} \end{aligned} $$
(6.29)

where \(C_{max}\left (X,U\right )\) is the executed makespan of an allocation X subjected to uncertainties U.

So, to assess the value of \(\mathcal {S}\mathcal {L}\) using DES models and associated Model-Checking, the property to check is: “What is the probability of reaching, in a time that is less than or equal to C max, a global state where all the job models α j are in the marked location Completed?”

Using PCTL, this property can be expressed as follows:

$$\displaystyle \begin{aligned} P=?\left[F \leq C_{max} \ ``\forall j\ \alpha_j.Completed''\right] {} \end{aligned} $$
(6.30)

where P =? means that we want to assess the probability that the property that is inside the brackets [ ] is reached. This property can be translated as follows:

  • “F ≤ C max” means “There exists in the future, within a time that is less than or equal to C max,”

  • “∀j α j.Completed” means “a state where, for all j, all the stochastic timed automata α j are in the marked location Completed.”

That means that the formula \(\left [F \leq C_{max} \ \mbox{``}\forall j\ \alpha _j.Completed\mbox{''}\right ]\) is a PCTL expression for: \(C_{max}\left (X,U\right ) \leq C_{max}\).

Here, two robustness levels, associated respectively with each machine, can be defined. They are obtained by taking the uncertainties into account on machine 1 only or on machine 2 only. We can thus evaluate which machine is the more sensitive. These two robustness levels are defined as follows:

$$\displaystyle \begin{aligned} \begin{array}{l} \varGamma_1=\mathbb{P}\left(C_{max}\left(X,\left(\hat{t}_{j1}\right)_{j1}\right) \leq C_{max}\right)\\ \varGamma_2=\mathbb{P}\left(C_{max}\left(X,\left(\hat{t}_{j2}\right)_{j2}\right) \leq C_{max}\right) \end{array} {} \end{aligned} $$
(6.31)

where \(\left (\hat {t}_{j1}\right )_{j1}\) (resp. \(\left (\hat {t}_{j2}\right )_{j2}\)) are the uncertainties on the operation durations when considering that there are no uncertainties on the machine 2 (resp. 1). Finally, these robustness levels indicate whether the inequality (6.9) is satisfied. This result is used in the Update Module to decide whether Ω must be updated.

Moreover, we are able to evaluate a general robustness level considering the global uncertainties \(\left (\hat {t}_{jk}\right )_{jk}\) as follows:

$$\displaystyle \begin{aligned} \varGamma=\mathbb{P}\left(C_{max}\left(X,\left(\hat{t}_{jk}\right)_{jk}\right) \leq C_{max}\right) \end{aligned} $$
(6.32)

As the two machines are independent, we have Γ = Γ 1 × Γ 2.
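The product relation can be checked on a small example. The sketch below does not use a model checker: it computes the three probabilities exactly by enumeration, assuming independent, integer-valued job durations and assuming that the nominal load of the unperturbed machine does not exceed C max. The duration distributions, the value of C max, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def load_distribution(job_pmfs):
    """Distribution of a machine's total load (convolution of its job pmfs)."""
    dist = {0: Fraction(1)}
    for pmf in job_pmfs:
        new_dist = {}
        for load, p_load in dist.items():
            for duration, p_duration in pmf.items():
                total = load + duration
                new_dist[total] = new_dist.get(total, Fraction(0)) + p_load * p_duration
        dist = new_dist
    return dist


def prob_load_at_most(job_pmfs, c_max):
    """P(sum of the job durations <= c_max) for one machine."""
    return sum(p for load, p in load_distribution(job_pmfs).items() if load <= c_max)


def joint_robustness(m1_pmfs, m2_pmfs, c_max):
    """P(max(load_1, load_2) <= c_max), enumerating every joint outcome."""
    n1 = len(m1_pmfs)
    gamma = Fraction(0)
    for outcome in product(*(pmf.items() for pmf in m1_pmfs + m2_pmfs)):
        durations = [d for d, _ in outcome]
        prob = Fraction(1)
        for _, p in outcome:
            prob *= p
        if max(sum(durations[:n1]), sum(durations[n1:])) <= c_max:
            gamma += prob
    return gamma


# Hypothetical instance: two jobs on machine 1, one job on machine 2.
q = Fraction(1, 4)
machine_1 = [{2: q, 3: 2 * q, 4: q}, {1: q, 2: 2 * q, 3: q}]
machine_2 = [{4: 2 * q, 5: q, 6: q}]  # asymmetric distribution
c_max = 5

gamma_1 = prob_load_at_most(machine_1, c_max)
gamma_2 = prob_load_at_most(machine_2, c_max)
gamma = joint_robustness(machine_1, machine_2, c_max)
print(gamma_1, gamma_2, gamma)  # 11/16 3/4 33/64
```

Here the global level is computed directly from the joint outcomes, not as a product, so the equality Γ = Γ 1 × Γ 2 is a genuine check of the independence argument on this instance.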

6.3.2.4 Update Module

6.3.2.4.1 Update of Ω

Here we assume that the decision-maker is able to fix a robustness level Γ ref that the system should achieve. This robustness level is the minimal acceptable probability that the executed makespan is smaller than or equal to the reference makespan C max. Moreover, we consider that:

  • the machines are independent: \(\varGamma ^{ref}=\varGamma ^{ref}_1 \times \varGamma ^{ref}_2\)

  • the contributions of each machine to the global robustness level are equivalent (no machine is more critical than the other).

Thus, \(\varGamma ^{ref}_k\) (defined in the inequality (6.9)) can be fixed as follows:

$$\displaystyle \begin{aligned}\forall k \in \{1,2\}, \varGamma^{ref}_k = \sqrt{\varGamma^{ref}}\end{aligned}$$

Following the assessment of Γ, Γ 1, and Γ 2 by the Discrete Event Systems Module, the algebraic distances to the required minimal robustness levels Γ ref, \(\varGamma _1^{ref}\), and \(\varGamma _2^{ref}\) can be calculated:

$$\displaystyle \begin{aligned} \begin{array}{rcl} D=\varGamma^{ref}-\varGamma \end{array} \end{aligned} $$
(6.33)
$$\displaystyle \begin{aligned} \begin{array}{rcl} D_1=\varGamma^{ref}_1-\varGamma_1 \end{array} \end{aligned} $$
(6.34)
$$\displaystyle \begin{aligned} \begin{array}{rcl} D_2=\varGamma^{ref}_2 - \varGamma_2 \end{array} \end{aligned} $$
(6.35)

If D 1 > 0 or D 2 > 0, the required robustness levels are not reached and the parameters Ω 1 and Ω 2 have to be updated. We propose to do it as follows:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \varOmega_1=\varOmega_1 + D_1 \end{array} \end{aligned} $$
(6.36)
$$\displaystyle \begin{aligned} \begin{array}{rcl} \varOmega_2=\varOmega_2 + D_2 \end{array} \end{aligned} $$
(6.37)

We can note that these update formulas are arbitrarily defined. However, they express the fact that the further away from the objective (the bigger D k), the more the parameters Ω k must be amplified.
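The update step can be sketched as follows for the two-machine case, applying Eqs. (6.34) to (6.37) literally: as soon as one distance is positive, both parameters are updated. Whether a machine that already meets its target should be left unchanged is not specified above, so this reading is an assumption, as are the function name `update_omega` and the numerical values of the example.

```python
import math


def update_omega(omega, gamma, gamma_ref):
    """One pass of the Update Module for two independent, equally critical machines.

    omega: current [Omega_1, Omega_2]; gamma: assessed [Gamma_1, Gamma_2];
    gamma_ref: global target. Returns (new_omega, distances).
    """
    gamma_k_ref = math.sqrt(gamma_ref)            # equal split of the global target
    distances = [gamma_k_ref - g for g in gamma]  # D_k, Eqs. (6.34)-(6.35)
    if any(d > 0 for d in distances):             # at least one target is missed
        omega = [w + d for w, d in zip(omega, distances)]  # Eqs. (6.36)-(6.37)
    return omega, distances


# Hypothetical assessed levels, both below the per-machine target.
new_omega, distances = update_omega([0.0, 0.0], [0.80, 0.85], 0.90)
print([round(w, 4) for w in new_omega])  # [0.1487, 0.0987]
```

When every distance is negative or zero, the function returns Ω unchanged, which corresponds to the case where the required robustness levels are reached.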

6.4 Application

In the application, 10 jobs are considered, with execution times having the uncertainties defined in Table 6.3. Moreover, Γ ref is fixed to 0.90, meaning that the probability that the executed makespan is less than or equal to the optimal value must be at least 0.90.
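As a worked value, and assuming the equidistribution rule of Sect. 6.3.2.4.1 is applied to the two machines of this application, a global target of 0.90 gives the per-machine target:

$$\displaystyle \begin{aligned} \varGamma^{ref}_1 = \varGamma^{ref}_2 = \sqrt{0.90} \approx 0.949 \end{aligned} $$

Each machine taken alone therefore has to reach a higher robustness level than the global target.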

Table 6.3 Characteristics of jobs

Table 6.4 gives the different iterations of the combined approach. We started with Ω = [0, 0] (meaning that no uncertainty is considered). Two solutions are explored during different iterations. The solution X 1 allocates the first machine to jobs 1, 4, 6, 7, 8 and the second machine to jobs 2, 3, 5, 9, 10. The solution X 2 allocates the first machine to jobs 1, 4, 6, 7, 8, 10 and the second machine to jobs 2, 3, 5, 9.

Table 6.4 Iterations for the application

This application shows that combining the two approaches makes it possible to converge to a solution with a good robustness level without degrading the makespan too much. Initially (without perturbations), the makespan was 12 and the associated robustness level was 0.65. If the decision-maker accepts degrading this makespan by around 17% (increasing the makespan to about 14), then the robustness level reaches 0.90. Moreover, this approach is a good means of tuning the Ω parameters even if the probability distribution associated with the uncertainties is not symmetrical. We can note here that the makespan for a robustness level of 1 is C max = 18 (namely the most conservative solution). Thus, the couple \(\left (C_{max}=13.98, \varGamma =0.9\right )\) is a good compromise between optimality and robustness.
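The overall iteration can be sketched as a loop that alternates optimization, robustness evaluation, and update of Ω until the per-machine targets are met. This is only an illustration under assumptions: `solve` and `evaluate` are hypothetical callables standing in for the Operations Research module and the Discrete Event Systems module, and the toy functions below use arbitrary formulas that do not reproduce the values of Table 6.4.

```python
import math


def robust_iteration(solve, evaluate, gamma_ref, omega, max_iter=20):
    """Alternate optimization and robustness evaluation for two machines.

    solve(omega) -> (X, c_max)                  # stand-in for the optimization step
    evaluate(X, c_max) -> [Gamma_1, Gamma_2]    # stand-in for the model-checking step
    """
    gamma_k_ref = math.sqrt(gamma_ref)  # two machines, equal contributions
    history = []
    for _ in range(max_iter):
        x, c_max = solve(omega)
        gammas = evaluate(x, c_max)
        history.append((list(omega), c_max, list(gammas)))
        distances = [gamma_k_ref - g for g in gammas]
        if all(d <= 0 for d in distances):
            return x, c_max, history  # required robustness levels reached
        omega = [w + d for w, d in zip(omega, distances)]
    raise RuntimeError("robustness target not reached within max_iter iterations")


# Toy stand-ins (arbitrary formulas, for illustration only).
def toy_solve(omega):
    return tuple(omega), 12.0 + 4.0 * sum(omega)


def toy_evaluate(x, c_max):
    return [min(1.0, 0.8 + 2.0 * w) for w in x]


x, c_max, history = robust_iteration(toy_solve, toy_evaluate, 0.90, [0.0, 0.0])
print(len(history))  # 2
```

The `history` list plays the role of an iteration table: each entry records the Ω used, the resulting makespan, and the assessed robustness levels.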

6.5 Conclusions

Among the major issues related to Industry 4.0, risk management and, consequently, robust decision support play a significant role in the concerns of decision-makers.

In order to provide an efficient answer to this problem, we have proposed a generic method combining robust mathematical programming and Discrete Event Systems models. It makes it possible to reach the level of robustness desired by the decision-maker by finely assessing the degree of robustness of the solutions provided by the optimization module, regardless of the probability distributions followed by the uncertainties on the model input data. We have illustrated the method on a scheduling problem with parallel machines.

However, since our methodology is generic, it will have to be adapted to the context of use. In particular, the mechanism for updating the robustness coefficients Ω can be designed more efficiently to increase the rate of convergence of the methodology towards a solution with the required robustness level. In addition, instead of considering an equidistribution of the levels of robustness to be obtained over all the constraints of the model, a more specific distribution can be considered, taking into account, for example, the configuration of the production system or the criticality of certain machines (for instance, requiring greater robustness for bottleneck machines).

With the development of Industry 4.0 and, more particularly, the increasing use of digital twins, such hybridizations between decision support and performance evaluation models are likely to develop. The advent of Big Data and its consequences in terms of model calibration (in particular through a more realistic estimation of the probability laws modeling data uncertainties), combined with ever-increasing computing power, will make it possible to implement this type of methodology in decision support tools in an industrial context.