1 Introduction

Flexible manufacturing systems (FMS) consist of a finite set of operations and resources and can process different types of parts based on a prescribed sequence of operations. In an FMS, there is some amount of flexibility that makes it able to respond to varying conditions. The purpose of this work is to solve scheduling problems for flexible and smart FMS used by the industry of the future. Nowadays production with large series of identical products is no longer competitive. Consumers ask for personalized products. Consequently, the great challenge of Industry 4.0 is to offer personalized products, and despite low manufacturing volumes, to maintain gains. The consumers also like to communicate with the processes and the machines during the production phases. The concept of industry 4.0 or industry of the future corresponds to such new organizations of the means of production called "smart production". Smart production requires, for example, to give flexibility to some operations (and to maintain precedence constraints to other ones) as shown in Fig. 1: in the first job, the controller could start by M1 then M2 or by M2 then M1. Smart production also requires to use the same machines or servers to achieve different operations: in Fig. 1, M1 is used for an operation with a duration of 60 s in the first job and for another operation (different from the previous one) with a duration of 100 s in the second job.

Fig. 1
figure 1

Example of an FMS organization in Industry 4.0

In this context, a particular class of FMS that combines operations with total precedence constraints and operations with full routing flexibility (namely hybrid FMS) is proposed in this work.

Petri nets (PN) have been widely used for the design, modelling and analysing of FMS with concurrency, synchronization, sequencing and sharing of resources. The optimization of a specific cost function is a main objective of many scheduling problems aiming to allocate a limited number of resources within several operations. However, the scheduling problem of complex FMS is NP-hard and the computational time to obtain an optimal schedule grows exponentially with respect to the problem size. The problem becomes even more challenging in the context of Industry 4.0. Thus, a large literature has been devoted to such optimization problems (Cassandras 1993; Lee and DiCesare 1994; Luo et al. 2015). In this paper, we develop first a systematic structured formalism that is suitable to represent hybrid FMS from the industry of the future. Then, based on the obtained model and on a new cost function that is proved to be a lower bound of the true makespan, a variant of the Beam Search method is detailed to compute sequences of minimal duration by exploring selectively the PN state space. The approach should also avoid deadlocks and dead branches that are a priori unknown for the controller.

The paper is organized as follows. Section 2 is about the state of the art and gives the position of the paper with respect to the usual organizations of FMS and the control issues of scheduling problems. In Sect. 3, we present timed Petri nets (TPN) and Sect. 4 details the method used to model the hybrid jobs with transition-timed Petri nets (T-TPN). In Sect. 5, the approach used for scheduling control of hybrid FMS is described. Finally, computational experiments and results are exposed in Sect. 6.

2 State of the Art

PN have been widely used to model scheduling problems for discrete event systems (DES). Liu and Barkaoui (2016) have listed and discussed the subclasses of PN used for this purpose. In this section, we remind some usual organizations of FMS and the use of PN for scheduling problems.

2.1 Job Shop Modelling

A job shop is a manufacturing process structure in which a set of jobs are processed on multiple resources with total precedence constraints. In a job shop, called also multi-path workshop, each job is composed of a sequence of operations performed according to a well-defined order a priori fixed. Each operation is processed on a specific resource. In order to model the job shop scheduling problem, a subclass of ordinary and conservative PN referred to as S3PR (System of Simple Sequential Processes with Resources) has been defined (Ezpeleta et al. 1995; Xing et al. 1996).

Later, in order to extend the use of resources in job shop, a generalization of S3PR model referred to as S4R (Systems of Sequential Systems with Shared Resources) has been defined by Barkaoui and Ben Abdallah (1996). This extension exceeds the limitation of the use of resources and concerns especially the modelling of concurrently cyclic sequential processes sharing common resources: in S3PR, only one resource is allowed per operation which is not the case for S4R where the use of resources is almost arbitrary and only requires conservativeness.

2.2 Open Shop Modelling

An open shop is a workshop with full routing flexibility where a set of jobs are processed on multiple resources without any precedence constraints. In order to model the open shop scheduling problem, the S3PR nets could be used. The problem is that due to complexity issues, this representation becomes quickly difficult to handle especially for large systems. Therefore, the use of S3PR leads to an exponential explosion of the size of the net with regard to the number of operations because there are n! possible routes for n operations of a single job. An extension of S3PR, referred to as S2OPR (Set of Simple Open Processes with Resources), has been proposed to overcome the problem (Mejía et al. 2017). This extension is a compact representation for the open shop problem, and the net is much smaller compared to its equivalent S3PR model, but it needs to be 1-bounded.

2.3 Scheduling Control

PN are largely used for control issues of DES (Cassandras 1993). They are also a popular graphical and mathematical tool used to handle the scheduling problems with routing flexibility in the presence of shared resources. The key concept is to explore selectively the state space of the PN model with the intention of finding a firing sequence, with the minimum duration. The firing sequence is computed by optimizing a specific cost function.

A first approach concerns global methods where the whole reachability graph is explored (Lefebvre and Daoui 2018; Lefebvre 2018). This approach leads surely to optimal solutions, but the problem is the exponential explosion of the computational time and memory especially for large systems. Consequently, the search space grows exponentially with problem scale and makes the scheduling method applicable to small systems only. To overcome the limitation of global methods, a large literature has been devoted in order to search solutions with weak complexity.

The A* search algorithm is a very well-known algorithm used to accelerate the search of solutions. This technique expands only the most promising branches of a search tree. Lee and DiCesare (1994) proposed an algorithm which combines the Petri net reachability graph and the A* to schedule FMS. However, the result leads to an explosion of the state space of the net (Pearl 1984). To improve the exploration, researchers developed several improvements on it, for example best-first strategy with controlled backtracking (Xiong and Zhou 1998; Lefebvre 2016), dynamic programing (Xiong et al. 2014), hybrid A* search in order to relax the evaluation scope (Lee and Lee 2010), dynamic window search algorithm in which a policy is applied to select the most promising paths (Reyes et al. 2002), A* algorithm with depth-first strategy (Huang et al. 2008), iterative deepening A* with backtracking (Baruwa et al. 2015) and more informed heuristics (Liu et al. 2014; Moro et al. 2000; Huang et al. 2014; Mejía and Niño 2017).

Contrary to the A* algorithm where the previously explored candidates during search are all conserved, an informed graph search referred to as Beam Search (BS) algorithm (Ow and Morton 1988) conserves only the best β previously explored candidates. The complexity in time and memory remains polynomial thanks to this restriction, but the main drawback is that promising candidates could be eventually eliminated due to the selection method. Several variants of the Beam Search algorithm in combination with Petri nets have been proposed (Mejía and Odrey 2005; Mejía and Montoya 2009; Luo et al. 2015). However, for complex FMS, these improvements remain either insufficient or inapplicable to reach the optimal scheduling.

3 Timed Petri Nets

A PN structure is defined as <P, T, WPR, WPO>, where P = {p1,…, pm} is a set of m places and T = {t1,…, tq} is a set of q transitions. WPO ∈ (N)m×q and WPR ∈ (N)m×q are the post- and pre-incidence matrices (N is the set of non-negative integer numbers), and W = WPOWPR ∈ (Z)m×q (Z is the set of positive and negative integer numbers) is the incidence matrix. For any node x ∈ TP, x• stands for the postset of x and •x stands for the preset of x. < G, M0 > is a PN system with initial marking M0 and M ∈ (N)m represents the PN marking vector. For each marking M, P(M) ⊆ P is defined as the subset of non-empty places at marking M (i.e. the support of M). A transition tj is enabled at M if there are enough tokens in the preset of tj: ∀ pi ∈ P, M(pi) ≥ WPR(pi,tj) where M(pi) stands for the number of tokens in place pi and WPR(pi,tj) is the element of matrix WPR at row i and column j. The firing of a transition t at marking M removes WPR(p, t) tokens from each place p in •t and adds WPO(p, t) tokens to each place p in t•. This is written as M [t > M’ (Ezpeleta et al. 1995; Mejía et al. 2017) where M’ is the resulting marking when the enabled transition t fires at marking M.

There are several classes of TPN, in particular T-TPN, that associate a firing duration to each transition in the net (Ramchandani 1974), place-timed Petri net (P-TPN) that associate a sojourn duration to each place of the net (Sifakis 1979) and time PN that associate a time interval with each transitions (Merlin 1974). In this paper, T-TPN are used to represent the time required to perform the operations.

A T-TPN is a 6-tuple N =  < P, T, WPR, WPO, D, M0 > where < P, T, WPR, WPO, M0 > is a marked PN, and D: T → R+ is a firing time function that assigns a positive real delay D(t) to each transition t of the net. A transition t may fire at earliest after the minimal delay D(t) from the date it has been enabled. The residual firing time is defined for each transition tj at each marking M. If a transition tj becomes enabled at time τenabled(tj), it will fire at earliest at time τenabled(tj) + D(tj). In this paper, the residual firing time RFT(tj) ∈ [0, D(tj)] of transition tj is defined as the minimum value between the non-negative difference between the current time τ and the enabled time τenabled(tj) and the minimal delay D(tj): RFT(tj) = min(ττenabled(tj), D(tj)).

A timed firing sequence σ of length | σ |= K and of duration τK is defined as σ = t(τ1)t(τ2)…t(τK) where τ1,…, τK represent the times of the firings that satisfy 0 ≤ τ1 ≤ τ2 ≤ … ≤ τK. The timed firing sequence σ fired at M leads to the timed trajectory (σ, M) = M(0) [t(τ1) > M(1)…. M(− 1) [t(τK) > M(K) with M(0) = M. The notation M [σ > M(K) denotes that there exists such a valid timed firing sequence σ from marking M to marking M(K).

In a PN structure, a Path is an orderly succession of K nodes x1 x2xK with xk ∈ TP such as xk+1 ∈ xk• for k = 1,…, − 1. The duration of Path is defined as:

$$ \chi \left( {{{Path}}} \right) = \sum\nolimits_{{t \in {{Path}}}} {D\left( t \right)} $$
(1)

A Path is said of minimal duration if there exists no other path from x1 to xK with a smaller path duration and we refer to the path of minimal duration from x1 to xN as to Path*(x1, xN). We refer to the duration of Path*(x1, xN) as to χ*(x1, xN).

4 Complex FMS Modelling

A FMS is a system composed of several jobs and resources. Each job is composed by a set of operations which perform on the resources. Usual FMS are job shops and open shops but more general organizations could be considered as hybrid FMS studied in this work. Let J and R be, respectively, a set of jobs and a set of resources. Each job J ∈ J consists of a set of operations OJ = {oi, i ∈ {1,…,n}}, where oi is the operation i of job J. The set of all operations is denoted O =  ∪ {OJ, J ∈ J}. Note that for simplicity the reference of a given operation oi to the corresponding job J is not explicitly reported, but since each operation oi belongs to a single job J there is no confusion.

An operation oi is defined by a duration di representing the time required to be performed and a subset of resources Ri ⊆ R required for its execution: oi = (di, Ri). An operation oi is modelled using one place pi and one timed transition toi so-called operation transition with D(toi) = di as shown in Fig. 2. Resource places are connected to the operation transition toi by means of self-loops: when oi is executed, the set of resources Ri needed is reserved from the enabling time to the firing time of toi.

Fig. 2
figure 2

T-TPN model of an operation with a single resource

4.1 T-TPN Model of a Sequence of Operations

T-TPN are used to model sequences of operations. In the following, we assume for simplicity that the capacities of the jobs of the FMS are limited to 1. The minimal duration of a single execution of the job J composed by a sequence of operations and without shared resource is given by (2):

$$ D\left( J \right) = \sum\nolimits_{{{{to}} \in J}} {D\left( {{{to}}} \right)} $$
(2)

An example of a cyclic sequence of two operations with a minimal duration D(J) = d1 + d2 is shown in Fig. 3. The operations of the job are ordered, the job processes o1 that requires the resource r1 and then o2 that requires r2.

Fig. 3
figure 3

T-TPN model of a sequence of operations

When the capacity is equal to 1, only one token can pass to perform the job and the others will stand on the start place sJ (s1 in the example of Fig. 3), waiting for the first token to finish all the operations of the job (o1 and o2 in the example of Fig. 3). When the first token passes to perform the job, the capacity becomes 0 (there is no token in the place cp1 of this example) and no more token in sJ can pass. Then, once the token has passed through all the operations of the job, the capacity returns to 1 and another token from sJ can be used to perform the job.

4.2 T-TPN Model of a Set of Operations with Full Routing Flexibility

T-TPN are also used to model a set of operations performed with full routing flexibility. The open shop could be modelled using S3PR or S2OPR. A comparison of sizes of both the S3PR and the equivalent S2OPR has been done (Mejía et al. 2017): for large systems where operations are performed with full routing flexibility, the S2OPR net is much smaller than its equivalent S3PR model.

A controller composed by flag and counter places is used with S2OPR (Mejía et al. 2017). For each job with full routing flexibility, a set of flag places PF = {fi, oi ∈ OJ} is connected to the postset of the start transition ts to ensure that the system executes each operation only once. In addition, a single counter place c is used to ensure that the system executes all operations. An example of two operations with full routing flexibility is shown in Fig. 4. The S3PR model is shown in Fig. 4a while its equivalent S2OPR model is shown in Fig. 4b. Operations of the job are not ordered: the job can process o1 that requires the resource r1 then o2 that requires r2 or o2 then o1. For S2OPR model, the controller is composed of the flag places f1 and f2 and the counter place c12.

Fig. 4
figure 4

T-TPN model of a set of operations with full routing flexibility

A set of operations performed with full routing flexibility without shared resource has a minimal duration D(J) computed with (2). The minimal duration of the example shown in Fig. 4 is D(J) = d1 + d2.

4.3 T-TPN Model of Hybrid Jobs

A FMS could include more complex organizations (and not only job shops and open shops) with partial routing flexibility. A particular class of complex shops, referred to as hybrid jobs, which satisfy some specific organization properties, is considered next. The organization of a hybrid job could be iteratively decomposed in a set of basic jobs. Such jobs could be either sequences of operations performed with total precedence constraints or sets of operations performed with full routing flexibility. The systematic design method introduced in our previous work (Cherif et al. 2018) to encode the iterated organization is improved here to be more efficient for the modelling of complex workshops. The modelling is based on some basic functions.

Operation (di, Ri) is the function used to model the operation oi. The duration di of the operation and the set of needed resources Ri are given as inputs. It returns the object “operation” which may be considered as an elementary job.

Sequential (J1, J2,…,Jh) is the function used to model a sequence of h jobs (eventually elementary jobs) given as inputs. It returns the resulting job “sequence” which is modelled as a sequence of J1 to Jh.

Open (J1, J2,…,Jh) is the function used to model a set of h jobs (eventually elementary jobs) with full routing flexibility. It returns the resulting job “open” as the set of jobs J1 to Jh.

In addition, the operations in the hybrid job could share some resources.

Example:

An example of hybrid FMS is described below. The hybrid FMS has two jobs J1 and J2 with 9 resources including 4 shared resources r2, r3, r7 and r9. J1 consists of the sequence of operations o1, o2, o3 and o4. J2 is a hybrid job composed of five operations o5, o6, o7, o8 and o9. Figure 5 shows the organization of the FMS according to the three basic functions previously described. The T-TPN model of this hybrid FMS is presented in Fig. 6.

Fig. 5
figure 5

Hybrid FMS organization

Fig. 6
figure 6

Model of T-TPN model of the hybrid FMS

5 Beam Search for Scheduling Control

In this paper, an algorithm which is inspired from Hybrid Filtered Beam Search (HFBS) presented in (Mejía and Niño 2017) is proposed. The HFBS combines the main principles of the Beam Search (Sabuncuoglu and Bayiz 1999), the Filtered Beam Search (FBS) (Ow and Morton 1988) and the Beam A* Search (Mejía and Montoya 2009; Mejía and Odrey 2005). The idea of HFBS algorithm is to provide diversification at early steps and intensification at late steps of the search. Indeed, the algorithm aims to select multiple candidates from different branches at the first steps but only fewer candidates at the later steps. The algorithm depends on two parameters: a global filter parameter βg which presents the maximum number of expanded candidates at one iteration of the algorithm (global beam width) and a local filter parameter βl which presents the maximum number of expanded successors from one given parent node (local beam width). The search algorithm starts with an initial marking that is the initial population. The search is based on iterations: at the first iteration, the algorithm generates the successors of the initial population and selects the best βg candidates, and a new population is created. Then at each iteration, it expands the population obtained at the previous iteration and creates a new population composed of the best βg candidates. The algorithm terminates in two cases: it may stop when the reference marking is reached or when all candidates have been expanded. One drawback is the selection of the best parameter combination (βg and βl) for a given instance which is not obvious even though large values of βg and βl do not ensure to result in the optimal solution but certainly to expand unpromising candidates. However, when the selection of the nodes is good, a near optimal solution or even an optimal solution can be obtained. The selection of candidates that will be expanded is based on a heuristic cost function which is defined next. The priority of expansion is given to the candidates that have the lowest values of the cost function.

5.1 Objective Function

In this section, a new cost function is defined to be relevant to complex FMS which consist of hybrid jobs where operations are performed which partial precedence constraints. The objective function calculates a value used to select the candidates to be expanded at each iteration of the search. Consider an initial marking M0 and a reference marking Mref such that one or more trajectories exist between M0 and Mref. Imagine that (σ, M0) is one of these trajectories and that (σ, M0) is incrementally computed. For each intermediate marking M of (σ, M0), the firing sequence σ is divided into two parts according to σ = σ1 σ2. The first part corresponds to the already computed trajectory (σ1, M0) from M0 to M (i.e., M0 [σ1 > M), and the second part corresponds to the residual trajectory (σ2, M) with an unknown sequence σ2 from M to Mref. The following cost function is reformulated as:

$$ f\left( {M_{\it{0}} ,\sigma_{\it{1}} ,M_{{{{ref}}}} } \right) = g\left( {\sigma_{\it{1}} ,M_{\it{0}} } \right) + h\left( {M,M_{{{{ref}}}} } \right) $$
(3)

where M is defined as M0 [σ1 > M.

For the trajectory (σ1,M0) already computed, a systematic algorithm computes the duration g(σ1,M0) of 1,M0) by transforming the untimed trajectory into a timed trajectory under earliest firing policy (i.e. each transition t in σ1 fires as soon as its time constraint D(t) is satisfied). The algorithm, detailed in (Lefebvre 2017a, b), updates at each new marking M the remaining durations of the enabled transitions. The estimation of the residual time to the reference is simply the sum of durations of unperformed operations which does not take into account the availability of resources. Indeed, to estimate the residual time with the function h(M,Mref), a reduced PN model Nr =  < Pr, T, WrPR, WrPO, D > is first introduced where Pr = P \ R is a set of mr places where resource places are removed and T is conserved as a set of q transitions. WrPO ∈ (N)mr×q and WrPR ∈ (N) mr×q are the reduced post- and pre-incidence matrices, and Wr = WrPOWrPR ∈ (Z) mr×q is the reduced incidence matrix. The marking of the reduced PN is referred to as Mr.

For a sequence J of operations (for example, a job J in job shop problems), several estimation functions hJ(M, Mref) of the residual duration of job J have been proposed: in Luo et al. (2015), hJ is based on the search of resource and operational places; in (Lefebvre 2017a), hJ is based on the residual firing count vector. In this work, we propose an estimation function hJ based on the search of the shortest paths from the marked places to a given reference place pref that is added as the end place to the global model of the FMS (Mejía and Nino 2017; Lefebvre and Mejía 2018). The objective function is improved here and adapted to deal with the modelling of the new organization of the FMS. The estimation hJ is defined depending on the type of the job. For a set of operations with total precedence constraints, the approximation hJ(M, Mref) of the cost of the unknown part of the trajectory is given by (4):

$$ h_{J} \left( {M,M_{{{\text{ref}}}} } \right) = M_{r} \left( {s_{J} } \right)D\left( J \right) + \max \{ \forall p_{i} \in P\left( {M_{r} } \right)\backslash \left\{ {s_{J} } \right\},{\chi^{*}\left( {p_{i} ,e_{J} } \right)}{-}{\text{RFT}}\left( {{{to}}_{i} } \right){\text{with}}\;{{to}}_{i} \in \left( {p_{i} } \right) \bullet\} $$
(4)

where Mr = Reduce(M) is the reduced marking obtained from M by removing capacity and resource places, sJ is the unique start place of the job J, eJ is the unique end place of the job J and RFT(toi) is a correction term that corresponds to the residual firing time of the first transition toi of the path of minimal cost between pi and eJ. In simple words, hJ(M, Mref) corresponds to the sum of the durations of operations that are not yet performed corrected according to the operation that is currently in progress: the first part of Eq. (4) refers to the product of the time needed to perform the job D(J) by the number of tokens in the start place Mr(sJ) (that should wait the end of the current execution of the job), and the second part refers to the maximum of the shortest paths from the marked places pi to the reference place eJ. One can notice that the reduced model of a set of operations with total precedence constraints is a single path with minimal duration between the unique start place sJ and the unique end place eJ. As long as the capacity of the job is 1, only the start place sJ could be marked with more than one token and there exists at most one place pi ∈ J\{sJ} such that M(pi) = 1. Consequently, Eq. (4) can be reformulated with (5):

$$ h_{J} \left( {M,M_{{{{ref}}}} } \right) = M\left( {s_{J} } \right) \times D\left( J \right) + {\chi^{*}\left( {p_{i} ,e_{J} } \right)}{-}{{RFT}}\left( {{{to}}_{i} } \right)\;\;\;\;{\text{with}}\;{{to}}_{i} \in \left( {p_{i} } \right) \bullet $$
(5)

The estimation function hJ(M, Mref) of the residual duration for a job J composed of a set of operations with full routing flexibility is also obtained as the corrected sum of the durations of operations that are not yet performed. Such a duration is no longer computed according to the paths of the PN structure but depends on the sum of the operation durations.

In a set of operations with full flexibility, the status of each operation oi is defined by an operation place pi and a flag fi. Three cases may occur: (1) the flag place fi is marked and the operation place pi is unmarked, and then the operation is not yet performed; (2) the operation place pi is marked and the flag place fi is unmarked, and then the operation is currently performing; (3) the flag place fi and the operation place pi are both unmarked, and then the operation oi is already performed.

Note that the flag place and the operation place of a given operation could not be marked together. Thus, the approximation hJ(M, Mref) of the duration of the unknown part of the trajectory, which is given by the sum of the operations duration that are not yet performed, is equal to the product of the flag place marking M(fi) by operation duration D(toi) plus the product of the operation place marking M(pi) by the operation duration D(toi). Observe that a correction term is added for the operation that is currently performing. The residual time to Mref is estimated by (6):

$$ h_{J} \left( {M,M_{ref} } \right) = M\left( {s_{J} } \right) \times D\left( J \right) + \mathop \sum \limits_{{f_{i} \in {\text{PF}}}} M\left( {f_{i} } \right) \times D\left( {to}_{i} \right) + \left( {M\left( {p_{i} } \right) \times D\left( {to_{i} } \right){-}{\text{RFT}}\left( {to}_{i} \right)} \right)\;\;\;\;{\text{with}}\;{to}_{i} \in \left( {p_{i} } \right) \bullet $$
(6)

where PF = {fi, oi ∈ OJ}is the set of flag places.

To calculate the residual duration for a hybrid job, we use the iterated organization presented in Sect. 4 with a set of basic jobs and we reformulate the whole model as a layered PN with L different levels. The idea, we propose here, is to introduce dynamic transitions (that will be represented with small rectangles in the next figures) and dynamic places (that will be represented with double circles in the next figures). Dynamic transitions have variable firing times in order to propagate the residual times from one layer to the next one. Dynamic places report the whole marking of a given basic job from one layer to another one. The combined use of dynamic transitions and places embeds the models at levels 1,…, − 1 into the dynamic transitions and dynamic places at level l: the marking and residual duration of each basic job at level 1,…, − 1 are propagated at level l to change, respectively, the marking of the dynamic places and the firing time of the dynamic transitions. The other places and transitions of the model will be referred to as static places and transitions.

Let J be a given job and Jl ={J1l,..,Jkl} be the set of basic jobs at level l and {1,..,L} be the set of levels. For any given marking M, the evaluation of hJ(M,Mref) is calculated as below:

At level 1, all places and transitions are static ones. The marking of the basic jobs is a simple copy of the PN marking and the firing duration D(to) of any timed transition to is equal to the duration of operation o.

At level l > 1, the marking of the static places is a copy of the PN marking the firing duration D(to) of a static transition to is equal to the duration of operation o. On the contrary, the marking of the dynamic places reports the number of tokens in the basic job Jk’l’ at level l’ < l, and the firing duration D(to) of a dynamic transition to is equal to the residual duration hJ,k’l’(M,Mref) of the basic job Jk’l’ at level l’ < l. The job Jk’l’ is defined by the FMS organization.

The estimation hJ(M, Mref) of the residual duration for the whole hybrid job J is obtained from the value hJL, obtained for the unique basic job at level L.

Example:

Consider the hybrid job J2 in Fig. 7 with 3 levels and 4 basic jobs {J11, J21, J12, J13}. The residual durations for the basic jobs J11 and J21 of the first level are estimated using (5). Then the time parameters d11 and d21 of the dynamic transitions {to11, to21, ti1121, ti2111} of the basic job J12 at level 2 are updated. The marking of the dynamic places pi11 and pi21 is also updated. Then, the residual duration for J12 is calculated using (6). The time parameter d12 of the dynamic transition to12 and the marking of the dynamic place pi21 in the basic job J13 at level 3 are updated. Thereafter, the estimation of the residual duration for J13 is evaluated using (5) and the residual duration of the hybrid job is obtained as hJ(M, Mref) = h13(M, Mref).

Fig. 7
figure 7

Propagation of the marking and residual duration for a hybrid job

The residual duration for the FMS is obtained as the maximum value of the residual durations over all hybrid jobs of the FMS:

$$ h\left( {M,M_{{{{ref}}}} } \right) = \max \left\{ {h_{J} \left( {M,M_{{{{ref}}}} } \right) \text{with}\,J \in {\varvec{J}}} \right\} $$
(7)

In order to guarantee that the algorithm can find an optimal solution, h(M, Mref) must be admissible (Nilsson 1980) for any reachable marking M, i.e. h(M, Mref) ≤ h*(M, Mref), where h*(M, Mref) is the actual minimum time from M to Mref under earliest firing policy.

Proposition: Let us consider a hybrid FMS. The estimation h(M, Mref) computed with (7) is a lower bound of the actual duration of the residual trajectory (σ2, M) from M to Mref (i.e. is admissible).

Proof:

According to the iterated organization, previously introduced, each job J of the FMS is described with one or several basic jobs Jkl organized in L levels and a set of resources that are shared by the operations. Jkl is either a simple operation, or a sequence of jobs or a set of jobs with full routing flexibility.

At level 1, for a sequence of operations Jk1 with a capacity of 1 and without any resource, the estimation of the residual duration is given by (5). This estimation is the actual duration. When resources are considered, this duration increases if some resources become unavailable. Thus, the true duration is at least equal to (5). The reasoning is similar for a set of operations Jk1 with full routing flexibility. The actual duration is at least equal to (6) because an additional waiting time should be considered when the resources become unavailable.

At level l > 1, the reasoning is quite the same, but sequences of jobs should be considered instead of sequences of operations (resp. sets of jobs with full routing flexibility instead of sets of operations). From the marking M and the computation of the estimation hJ,k’l’(M, Mref) of the residual duration for the basic jobs Jk’l’ with l’ < l, the markings of the dynamic places and the firing durations of the dynamic transitions of basic job Jkl are first updated and then the estimation hJ,kl(M, Mref) of the residual duration for the basic job Jkl is computed. This estimation is equal to the actual duration when resources are not considered. Adding shared resources increases the actual duration but does not change the value of hJ,kl(M, Mref). Propagating the estimations hJ,kl(M, Mref) up to level L leads to the estimation hJL(M, Mref) for job J that is a lower bound of the actual duration required to execute the job J. Finally, (7) is a lower bound of the actual duration required to execute all jobs in the FMS.

5.2 Generation Beam Search Algorithm

In the next, we call generation the new population composed by all direct successors (candidates generated from all the parents of one generation) from a given population of parents. With usual FBS, the comparison is made between the successors of one parent and all other parents. Since the selection of candidates is based on the objective function composed of the cost of the already computed trajectory and the estimation of the residual one which is a lower bound of the real cost, parents have more chance to be selected.

In this section a modified Beam Search algorithm, based on the notion of generation, is proposed in order to give equal chance to all candidates issued from the same generation to be selected. For each parent node, the algorithm starts by exploring their successors, then sorting them according to the total cost f to place the best βl candidates in a temporary list. Once the expansion of all parent nodes is done, the algorithm selects the best βg candidates from the temporary list to create a new generation. This routine is iterated according to the successive generations. The main innovation of the proposed algorithm is that the global sorting is made after expansion of all parent nodes which is not the case for the usual HFBS where the global sorting is made after the expansion of only the best parent. We refer to this variant of the FBS algorithm as Generation Filtered Beam Search (GFBS) (Cherif et al. 2019).

The objective is to explore selectively the PN state space according to the successive generations of candidates in order to reach the reference marking Mref starting from an initial marking M0 with a trajectory of minimal duration. The idea is to provide diversification at early steps and intensification at late steps while maintaining equal chances when selecting the candidates of a given generation. We present below the pseudo-code of the GFBS algorithm. The algorithm aims to return a valid sequence of transition firings Seq and the resulting cost Cmax. The algorithm uses a list, called OPEN, to store to-be-expanded candidates of a given generation. The cost function f detailed in Sect. 5.1 is used to sort candidates of one generation for selection purpose. The list OPEN is updated after each iteration (an iteration is the expansion of all candidates of a given generation).

figure a

This algorithm avoids implicitly deadlocks and dead branches that are a priori unknown for the controller: an infinite cost is associated with the candidates which could not achieve the objective marking Mref (deadlocks and dead branches). Consequently, these candidates are not selected for the next generation because their cost is higher comparing to other candidates.

6 Computational Experiments and Results

This section is devoted to illustrate the use of GFBS algorithm. The algorithm runs on a 1.8 GHz computer with 16G RAM memory. The first example is the hybrid FMS presented in Fig. 6 and aims to illustrate the proposed method based on the iterated organization. Then, two sets of instances are taken from the literature in order to compare the GFBS to other methods.

6.1 Details of the GFBS Algorithm on a Hybrid FMS

In this section, the GFBS algorithm is performed on the FMS modelled in Fig. 6. The processing times and resources required by the operations are given in Table 1.

Table 1 Operation processing times in TU and resources for the FMS in Fig. 6

The capacity of the jobs and the resources equal to 1: M0(rk) = M0(cpJ) = 1, J = 1,2, k = 1,..,9. The initial marking is such that M0(sJ) = 1. The reference marking is such that Mref(sJ) = 0 and Mref(cpJ) = Mref(rk) = 1. With parameters βg = 3 and βl = 2 the algorithm returns a makespan: Cmax = 21.

The details of the exploration are given in Fig. 8: the selected candidates at each iteration (the new generation) are shown as the already computed sequence (seq), the duration of the already computed sequence (g) and the estimation to the reference (h). The candidates in each population are generated with firing sequences that have an equal number of transitions, and the selection is made on the best value of g + h cost function. All candidates of a given generation have different markings (if two candidates have the same marking only the one with the minimal cost is saved for the future generations). The idea is to improve the diversification by giving other candidates, with different markings, the chance to be selected. In order to reach the reference and obtain the optimal result, 16 successive iterations were computed as shown in Fig. 8. The obtained makespan is optimal, and this has been validated by applying a global method (Lefebvre and Daoui 2018; Lefebvre 2018). This example illustrates that the proposed method is suitable for scheduling problems with hybrid FMS. Observe that the successive generations are of maximal size 3 and that no more than 2 candidates are generated from each parent.

Fig. 8
figure 8

Details of the exploration for the hybrid FMS of Fig. 6

6.2 Qualitative Comparison with Other Methods

The second example is taken from (Muth and Thompson 1963; Crowston et al. 1963). The scheduling problem is composed of six jobs, and each has to be processed on six resources with total precedence constraints. This well-known benchmark referred to as Ft06 was also tested in numerous papers based on the branch and bound methods (Carlier and Pinson 1989; Perregaard and Clausen 1998; AitZai and Boudhar 2013; Dabah et al. 2016), the genetic algorithms (Asadzadeh and Zamanifar 2010; Koblasa and Kloud 2011; Goncalves and Resende 2014), the taboo search (Watson et al. 2003; Kuo-Ling and Ching-Jong 2008; Peng et al. 2015), the Simulated Annealing Algorithm with Cooperative Threads (SACT) (Cruz-Chávez et al. 2019) and the Particle Swarm Optimization (PSO) (Huang et al. 2019). The details of the problems are given in Table 2. The range of processing times on each resource, shown in Table 3, was from one to ten days. The objective is to perform one part for each job of the FMS.

Table 2 Operation processing times in TU and resources for Ft06 benchmark
Table 3 Comparison of methods and performances for Ft06 benchmark

The optimal solution as given in the literature is Cmax = 55 days. Using parameters Bg = 20 and Bl = 2, our approach gives results as good as the ones taken from the literature (i.e., Cmax = 55 days). In addition, the GFBS does not need more than 1 s, which matches also the literature results. This example also illustrates the large variety of existing approaches studied for such kind of scheduling problems.

6.3 Performance Comparison with Other Methods

The third example presented in Han et al. (2015) and used in Mejia and Nino (2017) consists of 20 instances. The FMS is composed of 2 jobs and 4 resources. The example is originally modelled as a P-TPN, and we generate its equivalent T-TPN by replacing each timed place by two places and one timed transition (Sifakis 1979) as shown in Fig. 9.

Fig. 9
figure 9

From P-TPN to T-TPN

The P-TPN and its equivalent T-TPN are presented in Fig. 10. Note that the size of the generated T-TPN with 22 places and 17 transitions is bigger than its equivalent P-TPN with 15 places and 10 transitions. Consequently, the size of the reachability graph of the T-TPN model is also bigger than the size of the reachability graph of the P-TPN model. This penalizes our method compared to other methods that are based on P-TPN models.

Fig. 10
figure 10

Model of second set of instances

The processing times of operations are d11 = 25, d12 = 23, d13 = 27, d14 = 25, d21 = 25, d22 = 25 and d23 = 25. Twenty instances with different resource capacities and job sizes are considered.

The instances are denoted by FMS01-FMS20 as shown in Table 4. The GFBS algorithm is performed on these instances with different values of the parameters βg in the range [20:100] and different values of the parameters βl in the range [2:10]. The obtained results are given in Table 4.

Table 4 20 instances of the first set

The results are compared with the Hybrid Particle Swarm Optimization algorithm (HPSO) used in (Han et al. 2015) and the Iterated Hybrid Filtered Beam Search (IHFBS) used in Mejia and Nino (2017). HPSO algorithms used the deadlock controller proposed by (Piroddi et al. 2008). The IHFBS is based on the repeated call of the search algorithm where the value of the objective function of the current call is used as upper bound for the next call.

The comparison with HPSO algorithm can be made from the perspective of the memory and the computational time: in term of memory, the search space of T-TPN model used with GFBS algorithm is much bigger than its equivalents P-TPN used with HPSO.

Contrary to GFBS, the HPSO algorithm uses deadlock controllers which further reduce the search space when it avoids deadlocks and dead branches. In addition, for HPSO the search stops only if a predefined maximum running time is reached. Indeed, even using a processor with higher performances (2.6 GHz computer with 4G RAM memory), the running time of HPSO algorithm is ranged from 500 s on FMS1 to 5000 s on FMS20. For GFBS algorithm, the running time is ranged from 1 s on FMS1 to 80 s on FMS20.

Although its complexity in time and memory, GFBS algorithm found better solutions in 10 out of 20 instances and matches the solutions of the 10 remaining instances.

The comparison with IHFBS algorithm leads to the following comments: in terms of memory, the IHFBS algorithm uses, as well as HPSO, the P-TPN model which is less complicated than its equivalent T-TPN used with GFBS. In addition, contrary to GFBS algorithm which is performed only once, IHFBS is based on repeated call of the search algorithm where the value of the objective function of the current call is used as an upper bound for the next call. Thus, the number of explored candidates is much bigger which influences the computational time needed to perform the algorithm. GFBS algorithm does not need more than 1 s to obtain the result for the smallest instance FMS1 while the time needed for the largest instance FMS20 was 80 s. For IHFBS, even using a processor with higher performances (3.2 GHz computer with 8G RAM memory), the running times ranged from 40 s on FMS1 to 1600 s on FMS20. Although its complexity in time and memory, GFBS algorithm found the same results for 16 of 20 instances. In the remaining 4 instances, the results obtained with IHFBS are slightly better.

Note that we have reported the CPU times as mentioned in the literature but that the processors used were different from our processor: GFBS runs on 1.8 GHz computer, HPSO runs on a 2.6 GHz computer (Han et al. 2015) and IHFBS runs on a 3.2 GHz computer (Mejia and Nino 2017).

Consequently, definitive conclusions about the comparison of the time complexity are difficult to draw. Nevertheless, the values reported in Table 4 illustrate that the time complexity of the approaches developed in (Han et al. 2015) and (Mejia and Nino 2017) exceeds widely the time required by GFBS even if the frequency of the used processor was larger.

7 Conclusion

In this paper, we have studied the modelling of hybrid FMS with partial routing flexibility and solve scheduling problems for such FMS. First, we have developed a T-TPN modelling methodology for FMS where some operations are proceeded with total precedence constraints and others with full routing flexibility. Second, a new scheduling method, which incrementally computes control sequences for hybrid FMS, was presented. The method uses T-TPN, an iterated organization of the operations for hybrid jobs and a new variant of beam search methods that selectively explores the PN state space according to successive generations of candidates. The cost function used by the GFBS method was proved to provide a lower bound of the trajectory duration. A set of instances has been detailed in order to illustrate that the GFBS algorithm is suitable for a large variety of FMS organizations, including job shops, open shops and hybrid FMS.

In our future works, we will introduce, uncertain environments by considering unreliable resources and failures of operations for hybrid FMS. The cost function will be updated in order to take into account the risk of deviation for the computed trajectory. Adding to that, models with more complicated time constraints (for example, maximal time constraints with time Petri nets (Merlin 1974)) will be considered.