1 Introduction

Horn et al. (2017) introduced the Job Sequencing with One Common and Multiple Secondary Resources (JSOCMSR) problem, which considers the scenario of scheduling a set of jobs where each job, for part of its execution, requires a common resource and, for the whole processing time, requires a secondary resource that is only shared with a subset of the other jobs. The goal of the JSOCMSR is to find a feasible schedule minimizing the makespan.

Applications of this problem can be found, for example, in manufacturing when the common resource corresponds to a machine on which the jobs are to be processed and the processing of each job also requires a certain accompanying resource like a casting mold, template etc. The important aspect is that there is a setup time (e.g. a preparation of the casting mold) at which the secondary resource is needed before the job can be executed at the machine and also a postprocessing time during which the secondary resource is needed afterwards (e.g. because the produced goods must be cooled in the casting mold). With job-dependent setup, processing and postprocessing times and a limited number of shared secondary resources, the JSOCMSR has been shown to be NP-hard (Horn et al. 2017).

A more specific application of this problem can be found in the daily scheduling of cancer patients that are to receive particle therapy (Maschler et al. 2016). There, the common resource corresponds to a sophisticated particle accelerator, which accelerates proton or carbon particles to almost the speed of light. This particle beam is directed into one of a small set of treatment rooms where one patient can be radiated at a time. The treatment rooms are here the secondary resources. During the setup time, a patient is positioned and possibly sedated and after the actual radiation with the particle beam, typically some medical examinations are to be done before the patient can leave the room and it becomes available for a next patient. In such particle therapy treatment centers, there is usually only a single particle accelerator because of its huge cost and about two to three treatment rooms. Since these treatment rooms are typically individually equipped for handling different kinds of treatments, the assignment of patients to rooms is pre-specified. Ideally, patients are scheduled in such a way that the particle beam is directly switched from one room to another so that patients are radiated without any significant breaks in between.

However, the JSOCMSR is in most cases only a strongly simplified model of real-world scenarios like the above patient scheduling. Most notably, the jobs start times are in practice frequently constrained to certain time windows arising from the availability of the underlying resources. Furthermore, in practice, it happens frequently that not all jobs can be scheduled due to these time windows and instead, a best subset of jobs that can be feasibly scheduled must be selected.

To also include such aspects is the focus of the current work: We extend the JSOCMSR by considering job-specific time windows, and instead of minimizing the makespan we aim at finding a feasible solution that maximizes the total prize of all scheduled jobs. To this end, each job has an assigned prize, which can simply take the value one if we want to maximize the number of scheduled jobs or it can take a value representing the priority of the job. Another possibility is that these prizes are correlated to the processing times of the jobs to avoid scheduling primarily short jobs.

These new aspects have a substantial impact on the structure of the problem and consequently on the algorithmic side, and existing methods for the JSOCMSR cannot be extended in efficient ways. Most importantly, the rather effective lower bound calculation for the makespan in the JSOCMSR we proposed in Horn et al. (2017) is useless for the new problem variant due to the entirely different objective function and new decision variables. Therefore we propose here a new A* search to solve this prize-collecting variant of the JSOCMSR to proven optimality. In particular, we investigate four different upper bound calculations for partial solutions which are used as heuristic functions to estimate the costs-to-go within the A* search. A further aspect of our A* search is that there is no specific target state known in advance.

An early version of this approach was already presented in our conference paper (Horn et al. 2018b). The current article extends this work by primarily considering an additional application scenario and corresponding new benchmark instances: pre-runtime scheduling of electronics within an aircraft, called avionics. The instances we include here are inspired by a sub-structure of the problem introduced in Blikstad et al. (2018), and the focus here is the part of the structure that coincides with that of the patient scheduling problem. We further remark that we fixed a bug in our implementation after the publication of Horn et al. (2018b) and consequently reran all experiments; while the general trends remain the very same, detailed numbers have changed.

In Blikstad et al. (2018), the scheduling of an industrially relevant avionic system is addressed. The considered system consists of a set of nodes and each of these contains a set of modules (processors) with jobs to be scheduled. Each node consists of one communication module, corresponding to the common resource, and a set of application modules, corresponding to the set of secondary resources. There are three types of jobs: partition jobs, communication jobs and regular jobs. Partition jobs, which are executed on the application modules run the system’s software applications. Each of these jobs has to communicate with the communication module and we consider only the case where a partition job has to communicate either at the beginning or at the end of its execution. Typically, the processing time of partition jobs is long compared to other jobs and its usage of the common resource is short compared to the total processing time of the job.

The communication is handled by communication jobs and regular jobs, both executed on the communication module. These have short processing times and in order to model that they use the common resource only, an additional artificial secondary resource is included for these jobs. The communication and regular jobs use both the common resource and the artificial secondary resource for their whole processing time. Communication jobs and regular jobs together handle different kinds of communication (system external, inter- and intranode) and the communication jobs have the specific purpose of representing jobs used for sending communication messages (Blikstad et al. 2018). For this reason, the communication jobs can only be scheduled at specific time slots where communication messages can be sent and the length of a communication job’s time window is equal to the job’s total processing time. This distinguish them from the regular jobs which have time windows with similar characteristics as those of the partition jobs.

To mimic the situation of creating a partial schedule for a node in the system, only a part of the total length of a schedule is considered and the tasks available exceed what is possible to include in the partial schedule. The prize of a job reflects the individual importance of including this job in the partial schedule. Note that compared to Blikstad et al. (2018), some simplifications are made with respect to the types of jobs included and by omitting precedence relations between jobs. Furthermore, we do not explicitly and fully consider the scheduling of the communication network used for communication between the nodes.

After a more formal problem definition in the next section and a survey of related work in Sect. 3, we describe the A* search in Sect. 4. This method relies on a specific state strengthening procedure and the use of a good heuristic guidance function corresponding to an upper bound calculation for the total prize that may still be a achieved from a partial solution onward. Concerning the latter, we investigate different possibilities based on linear programming and Lagrangian relaxations of a multidimensional knapsack problem formulation. For comparison purposes, we further consider, in Sect. 5, an order-based Mixed Integer Programming (MIP) model solved by Gurobi Optimizer Version 7.5.1 and, in Sect. 6, a constraint programming model implemented in MiniZinc. Section 7 presents and compares computational results for all these approaches on instances of the avionics scenario as well as balanced and skewed instances of the particle therapy scenario. The results show that the proposed A* search can solve substantially larger instances to optimality in shorter times than the other methods. Section 8 concludes this work with an outline of promising future research directions.

2 Problem formulation

The Prize-Collecting Job Sequencing with One Common and Multiple Secondary Resources with Time Windows (PC-JSOCMSR) considers the sequencing of a set of jobs where each job needs to respect resource constraints and time windows. The resource constraints refer both to a common resource that is used by all jobs and a set of secondary resources of which each job uses exactly one. It is assumed that it is usually not possible to find a feasible schedule that includes all jobs; instead each job is associated with a prize and the objective is to choose a subset of the jobs such that the sum of prizes of the sequenced jobs is maximized.

Let the set of all jobs be denoted by J, with \(|J|=n\), and let the prize of job j be \(z_j>0\), \(j \in J\). The problem is to find a subset of jobs \(S \subseteq J\) that can be feasibly scheduled so that the total prize of these jobs is maximized:

$$\begin{aligned} Z^*= \max _{S \subseteq J} Z(S) = \max _{S \subseteq J} \sum _{j \in S} z_j. \end{aligned}$$
(1)

The set of (renewable) resources is denoted by \(R_0= \{0\} \cup R\), with \(R=\{1,\ldots ,m\}\). During its execution, job j uses resource 0, referred to as the common resource, and one of the secondary resources \(q_j\in R\), \(j\in J\). Let \(p_j > 0\) be the processing time of job j, during which it fully requires the secondary resource \(q_j\), \(j\in J\). Further, let \(J_r = \{j \in J \mid q_j = r\}\) be the set of jobs that require resource r, \(r\in R\). For job j, \(j\in J\), the use of the common resource begins \(p^{\mathrm {pre}}_j \ge 0\) time units after the start of the job, has a duration of \(p^0_j\), and ends \(p^{\mathrm {post}}_j = p_j- p^{\mathrm {pre}}_j - p^0_j \ge 0\) time units before the end of the job.

If a job j is scheduled, it must be performed without preemption and within one of its \(\omega _j\) disjunctive time windows \(W_j=\{W_{jw} \mid w=0,\ldots ,\omega _j\}\) with \(W_{jw}=[W^\mathrm {start}_{jw},W^\mathrm {end}_{jw}]\), where \(W^\mathrm {end}_{jw}-W^\mathrm {start}_{j,w}\ge p_j\), \(j\in J\). We assume that each job has at least one time window. For job j, let the release time be \(T^\mathrm {rel}_j=\min _{w=0,\ldots ,\omega _j} W^\mathrm {start}_{jw}\) and the deadline be \(T^\mathrm {dead}_j=\max _{w=0,\ldots ,\omega _j} W^\mathrm {end}_{jw}\). The overall time interval to consider is then \(\left[ T^\mathrm {min},T^\mathrm {max}\right] \) with \(T^\mathrm {min}=\min _{j\in J} T^\mathrm {rel}_j\) and \(T^\mathrm {max}=\max _{j\in J} T^\mathrm {dead}_j\). Note that the existence of unavailability periods of resources is also covered by the above formulation since these can be translated into time windows of the jobs.

To simplify the consideration of the time windows of a job, we define the function earliest feasible time

$$\begin{aligned} \mathrm {eft}(j,t) = \min (\{ t'\ge t \mid \exists w: \left[ t',t'+p_j\right] \subseteq W_{jw}\} \cup \{T^\mathrm {max}\}), \end{aligned}$$
(2)

that yields the earliest time not smaller than the provided time \(t\le T^\mathrm {max}\) at which job j can be scheduled according to the time windows \(W_j\), \(j \in J\). Hereby, \(\mathrm {eft}(j,t)=T^\mathrm {max}\) indicates that job j cannot be feasibly included in the schedule anymore.

Since each job requires resource 0 and only one job can use this resource at a time, a solution to PC-JSOCMSR implies a total ordering of the scheduled jobs S. Vice versa, a permutation \(\pi =(\pi _i)_{i=1,\ldots ,|S|}\) of a subset of jobs \(S\subseteq J\) that can be feasibly scheduled can be decoded into a feasible schedule in a straight-forward greedy way by, in the order given by \(\pi \), placing each job from S at its earliest feasible time with respect to when the resources are available after being used by all its preceding jobs. A schedule derived from a job permutation \(\pi \) in this way is referred to as a normalized schedule. Note that if this greedy approach is applied to a permutation of jobs and some job cannot be feasibly scheduled in this way, this permutation does not correspond to a feasible solution. Also, an optimal solution is either a normalized schedule or the order of the jobs in this optimal solution can be used to derive a normalized schedule with the same objective value. For this reason the notation \(Z(\pi )\) is used for the total prize of the normalized solution given by the job permutation \(\pi \). Figure 1 in Sect. 4.1 shows an example of an instance with four jobs and two secondary resources where each job has exactly one time window and a corresponding optimal solution.

Fig. 1
figure 1

A PC-JSOCMSR instance with \(n=4\) jobs and \(m=2\) secondary resources. Each job has exactly one time window in \(W_j\). The optimal solution is given by the job sequence \(\pi =(2,3,4)\) and its normalized schedule visualized on the right side. The white region in each job denotes the part where, in addition to the job’s specific secondary resource, also the common resource is needed. Job 1 cannot be additionally scheduled due to the time windows and resource requirements. The solution’s total prize is \(Z(\pi )=9\)

It is not difficult to see that the PC-JSOCMSR is NP-hard: The decision variant of the JSOCMSR, which looks for a feasible schedule with a makespan not exceeding a given M, has already been shown to be NP-hard in Horn et al. (2017). One can reduce this decision problem to the PC-JSOCMSR in polynomial time by setting all time windows to \(W_j=\{[0,M]\}\) and all prizes to \(z_j = 1\). A solution to the JSOCMSR decision problem exists if and only if a solution to the PC-JSOCMSR can be found that has all jobs scheduled.

3 Related work

The JSOCMSR was originally proposed by Horn et al. (2017). There, a greedy heuristic, an A* algorithm, and a position-based MIP model are described. A particular contribution of that work is a method for calculating a relatively tight lower bound for the makespan, given a partial solution and the still open jobs. Experimental results show that thanks to this strong bound, already the greedy heuristic yields relatively good results by quickly deriving solutions with optimality gaps of only a few percent. Further, the A* search is capable of solving instances with up to 1000 jobs to proven optimality. In comparison, the presented MIP approach was not competitive and can only solve instances with up to 20 jobs reasonably well and it then requires substantially longer computation times. The lower bound calculation from this previous work is of no use for our PC-JSOCMSR due to the difference in objective function, the fact that usually not all jobs can be scheduled and the time windows. Note further that it is also not efficiently possible or straightforward to adapt the position based MIP model from Horn et al. (2017) to the PC-JSOCMSR due to the time windows.

Concerning the PC-JSOCMSR, Maschler and Raidl (2018) investigated heuristic methods based on multivalued decision diagrams (MDDs) and general variable neighborhood search for addressing larger problem instances with up to 300 jobs, which are so far clearly out of reach to be solved to proven optimality. Furthermore, Horn et al. (2018a) extends the works of Horn et al. (2018b) and Maschler and Raidl (2018) by proposing a novel construction method for relaxed MDDs. These relaxed MDDs are then further used to substantially speed up a heuristic search. In this way, new state-of-the-art results could be obtained for instances with up to 500 jobs. While all these works concentrated on heuristic solution approaches, the current work focuses on solving small to medium-sized instances exactly.

To the best of our knowledge, there are only a few further publications dealing with scenarios similar to the (PC-)JSOCMSR. Van der Veen et al. (1998) considers a setup comparable to JSOCMSR with jobs requiring one common resource and individual secondary resources. However, in their case the postprocessing times are negligible compared to the total processing times of the jobs. This implies that the start time of each job only depends on its immediate predecessor. This simplifies the situation substantially, since a job j requiring a different resource than its predecessor \(j'\) can always be started after a setup time only depending on job j, while a job requiring the same resource can always be started after a postprocessing time only depending on job \(j'\). Due to these properties, this problem can be interpreted as a Traveling Salesman Problem (TSP) with a special cost structure. Van der Veen et al. even show that their problem can be solved efficiently in time \(O(n\log n)\).

Somehow related to the JSOCMSR are no-wait flowshop problem variants, see Allahverdi (2016) for a survey. Each job needs to be processed on each of m machines in the same order and the processing of the job on a successive machine always has to take place immediately after its processing has finished on the preceding machine. This problem can be solved in time \(O(n\log n)\) for two machines via a transformation to a specially structured TSP (Gilmore and Gomory 1964). In contrast, for three and more machines the problem is NP-hard, although it can still be transformed into a specially structured TSP. Röck (1984) proved that the problem is strongly NP-hard for three machines by a reduction from the 3D-matching problem.

Furthermore, the JSOCMSR problem can be modeled as a more general Resource-Constrained Project Scheduling Problem with maximal time lags by splitting each job according to the resource usage into three sub-jobs that must be executed sequentially without any time lags; see Hartmann and Briskorn (2010) for a survey. Such an indirect solution approach, however, is unlikely to yield promising results in practice since problem-specific aspects are not exploited.

Moreover the PC-JSOCMSR is to some extent related to the well studied Orienteering Problem (OP), which essentially also combines the tasks of selecting a subset yielding a maximum prize with finding a sequence of the selected elements that make the solution feasible. Different variants of the OP have been studied, including OPs with time windows; for a survey see Gunawan et al. (2016). In the OP, each node is associated with a prize and travel times are known between all pairs of nodes. The task is to find a path from a given start node to an end node within a given time budget such that the total prize of the visited nodes is maximized. Due to the time budget not all nodes can be visited. In such an OP, the arrival time at a visited node only depends on the immediate predecessor, possible time windows, and the (constant) travel time between these two nodes. For PC-JSOCMSR, the situation is more complicated due to the secondary resources: A much earlier scheduled job requiring the same secondary resource may impact the earliest starting time of the job to be scheduled next. The PC-JSOCMSR, however, also does not generalize the OP with time windows as pairwise travel times are not covered by the PC-JSOCMSR.

Concerning particle therapy patient scheduling, the PC-JSOCMSR is a practically relevant improvement over the simpler JSOCMSR model, but it still is a strongly simplified formulation addressing only certain properties of the real-life problem. In the full practical scenario, many more aspects must be considered such as large time horizons of several weeks, sequences of therapies for patients to be treated, additionally needed resources including medical staff and their availability time windows, and a combination of more advanced objectives and diverse soft constraints. Maschler et al. (2016) proposed a greedy construction heuristic which is extended towards an Iterated Greedy metaheuristic and a Greedy Randomized Adaptive Search Procedure (GRASP), which consider more of these advanced aspects. These approaches treat the whole problem as a bi-level optimization problem in which the upper level is concerned with the assignment of treatments to days and the lower level corresponds to the detailed scheduling of the treatments assigned at each day. The Iterated Greedy metaheuristic was further refined by including an improved makespan estimation for the daily scheduling problems and by considering additional soft constraints for unwanted deviations of start times of the individual treatments for each therapy (Maschler et al. 2017, 2018).

Concerning A*, we point out that it is a well-known and prominent method for finding shortest paths in possibly huge graphs and more generally an informed search method for problem-solving, see Hart et al. (1968), Rios and Chaimowicz (2010). Whereas the classical A* algorithm follows a best-first-search strategy with the goal to find a proven optimal solution as quickly as possible, anytime A* algorithms also producing promising heuristic solutions on a more regular basis from the beginning on; see, for example, the Anytime Weighted A* algorithm (Hansen and Zhou 2007), the Anytime Repairing A* (Likhachev et al. 2004), Anytime Pack Search (Vadlamudi et al. 2016), and the A*/Beam Search hybrid described for the JSOCMSR (Horn et al. 2017).

4 An A* Algorithm for the PC-JSOCMSR

The A* algorithm is a classic search algorithm from the field of path planning on possibly huge graphs (Hart et al. 1968; Rios and Chaimowicz 2010). In this section, we describe our A* search approach to solve the PC-JSOCMSR problem. The method either yields a proven optimal solution or, in case of an early termination, a heuristic solution together with an upper bound to the optimal solution value. We start by describing the state graph on which the search is performed, continue with the framework of our A* algorithm, and focus in Sects. 4.3 and 4.4 on the strengthening of obtained states and propose different possibilities to determine upper bounds for the achievable total prizes of partial solutions, respectively.

4.1 State graph

We consider a weighted directed acyclic state graph \(G=(V,A)\) where each node in V represents a unique state (Pt) consisting of

  • the set \(P\subseteq J\) of jobs that are still available to be scheduled in further steps, and

  • the vector \(t=(t_r)_{r\in R_0}\) of the earliest times from which each of the resources are available for performing a next job.

The initial (root) state is \({\mathbf {r}}= (J,(T^\mathrm {min},\ldots ,T^\mathrm {min}))\) and represents the original PC-JSOCMSR problem instance with no jobs scheduled or excluded yet.

An arc \((u,v)\in A\) represents a transition from a state \(u=(P,t)\) to a state \(v=(P',t')\) that is achieved by scheduling a job j, \(j\in P\), at its earliest possible time w.r.t. vector t. More precisely, the start time of job j, \(j\in P\), w.r.t. state (Pt) is

$$\begin{aligned} \mathrm {s}((P,t),j) = \mathrm {eft}(j,\max (t_0-p^{\mathrm {pre}}_j, t_{q_j})). \end{aligned}$$
(3)

The transition function to obtain the successor state \((P',t')\) of state (Pt) when scheduling job j, \(j\in P\), next is

$$\begin{aligned} \tau ((P,t),j) = {\left\{ \begin{array}{ll} (P\setminus \{j\}, t'), &{}\quad \text{ if }\,\, \mathrm {s}((P,t),j) < T^\mathrm {max}, \\ {\hat{0}}, &{}\quad \text{ else }, \end{array}\right. } \end{aligned}$$
(4)

with

$$\begin{aligned} t'_0&= \mathrm {s}((P,t),j)+p^{\mathrm {pre}}_j+p^0_j, \end{aligned}$$
(5)
$$\begin{aligned} t'_r&= \mathrm {s}((P,t),j)+p_j, \quad \text{ for } r=q_j, \end{aligned}$$
(6)
$$\begin{aligned} t'_r&= t_r, \quad \text{ for } r\in R\setminus \{q_j\}, \end{aligned}$$
(7)

where \({\hat{0}}\) represents the infeasible state. The prize associated with a state transition is the prize \(z_j\) of the scheduled job j. Thus, each path in this state graph originating in the initial state \({\mathbf {r}}\) and ending in some other state than \({\hat{0}}\) corresponds to a feasible solution in which the jobs associated with the arcs are greedily scheduled in the order in which they appear in the path. Note that a feasible state \((P,t)\in V\) may, in general, be reached via multiple different paths, i.e., by including different sets of jobs S and/or by different orderings of these jobs. Therefore, a feasible state does, in general, not represent a unique solution. As we want to find a solution with maximum total prize, we are primarily interested in a path from \({\mathbf {r}}\) to (Pt) with maximum total prize. Let \(Z^\mathrm {lp}(P,t)\) be this maximum total prize to reach a feasible state (Pt). In order to solve the PC-JSOCMSR we are looking for a feasible state with the maximum \(Z^\mathrm {lp}(P,t)\). To find such a state we perform the A* search detailed in the following subsection. Figure 2 shows as example the state graph for the problem instance in Fig. 1.

Fig. 2
figure 2

State graph for the problem instance from Fig. 1 with \(n=4\) jobs and \(m=2\) secondary resources. Node labels denote the corresponding states (Pt), arc labels the scheduled jobs j. The path that represents the optimal solution is highlighted. Note that for the shown state graph the strengthening of states as described in Sect. 4.3 has been applied

4.2 A* algorithm framework

In order to solve the PC-JSOCMSR we have to find a feasible state (Pt) with maximum \(Z^\mathrm {lp}(P,t)\). Such a state cannot have any feasible successor, hence, either \(P=\emptyset \) or \(\tau ((P,t), j) = {\hat{0}}\), \(j \in P\), and otherwise \(Z^\mathrm {lp}(P,t)\) is not the maximum achievable prize.

A* search belongs to the class of informed search strategies that make use of a heuristic estimate for guidance in order to return a proven optimal solution possibly faster than a more naive uninformed search like breadth- or depth-first-search. Within our A* search, each encountered feasible state (Pt) of the state graph is evaluated by a priority function \(f(P,t) = Z^\mathrm {lp}(P,t) + Z^\mathrm {ub}(P,t)\) in which \(Z^\mathrm {lp}(P,t)\) corresponds to the prize of the so far best (i.e., longest) known path from \({\mathbf {r}}\) to state (Pt) and \(Z^\mathrm {ub}(P,t)\) is the A* search’s heuristic function estimating the “costs to go”. The latter is in our case an upper bound on the still achievable prize by extending the path from state (Pt) onward. The value of the priority function f(Pt) is thus an upper bound on the total prize that a solution may achieve by considering state (Pt). Different options for how to compute this upper bound will be investigated in Sect. 4.4.

Our A* search is shown in pseudo-code in Algorithm 1. It maintains the set W of all so far encountered states, implemented by a hash table; for each contained state (Pt), this data structure also stores the values \(Z^\mathrm {lp}(P,t)\) and \(Z^\mathrm {ub}(P,t)\) as well as a reference \(\mathrm {pred}(P,t)\) to the predecessor state of a currently longest path from \({\mathbf {r}}\) to (Pt), and the last scheduled job \(\mathrm {j}(P,t)\). Furthermore, the algorithm maintains the open list Q, which contains all states queued for further expansion. It is realized by a priority queue data structure that considers the states’ priority values f(Pt). Last but not least, our A* search maintains a reference \(x_\mathrm {maxlp}\) to the encountered state (Pt) with the so far largest \(Z^\mathrm {lp}(P,t)\) value. Both W and Q, as well as \(x_\mathrm {maxlp}\), are initialized with the initial state \({\mathbf {r}}\).

figure a

At each major iteration, a state (Pt) with maximum priority value f(Pt) is taken from Q. This state (Pt) is then expanded, which means that each job in P is considered as next job to be scheduled by calculating the respective successor state obtained by the transition \((P',t') = \tau ((P,t),j)\). If a job yields the infeasible state \({\hat{0}}\), it is skipped. Similarly, if the obtained state has been encountered before and the new path via state (Pt) is not longer than the previously identified path to \((P',t')\), we skip job j. Otherwise, if a new feasible state is reached, it is added to set W. Since a new longest path to state \((P',t')\) via (Pt) has been found, line 20 sets \(Z^\mathrm {lp}(P',t')=Z^\mathrm {lp}(P,t)+z_j\) and stores (Pt) as predecessor of \((P',t')\) and job j as the last scheduled job. The upper bound \(Z^\mathrm {ub}(P',t')\) is then also calculated, and if there is potential for further improvement, i.e., \(Z^\mathrm {ub}(P',t')>0\), state \((P',t')\) is added to the open list Q for a possible future expansion. Finally, reference \(x_\mathrm {maxlp}\) is updated if a new overall longest path is obtained.

A special aspect of our A* search therefore is that we do not have a specific target state that is known in advance, and we only add states that may yield further successor states to the open list. Lines 6 to 10 make sure that we nevertheless recognize when a proven optimal solution has been reached: This is the case when either the open list Q becomes empty or the priority value f(Pt) of Q’s top element (Pt) (i.e., the maximum priority value) is not larger than the length \(Z^\mathrm {lp}(x_\mathrm {maxlp})\) of the so far longest identified path. Note that the priority value of Q’s top element always is a valid overall upper bound for the total achievable prize. This follows from the fact that \(Z^\mathrm {ub}(P,t)\) is an admissible heuristic according to Hart et al. (1968), i.e., it never underestimates the real prize that can still be achieved from (Pt) onward. Further, an optimal solution is derived from state \(x_\mathrm {maxlp}\) by following its chain of predecessor states and respectively scheduled jobs, and the corresponding solution is returned.

A particular feature of our A* search is that it can also be terminated early by providing a time or memory limit for the execution and it still yields a heuristic solution together with an upper bound on the optimal solution value in this case. This heuristic solution is derived from the so far best state \(x_\mathrm {maxlp}\) by following its chain of predecessors and additionally considering all remaining jobs in P in their natural order (i.e., as given in the instance specification) for further addition in a greedy way. The returned upper bound is the priority value of Q’s top element.

4.3 Strengthening of states

Frequently, we can safely replace a state (Pt) by a strengthened state \((P',t')\), with \(P'\subseteq P\) and \(t'_r\ge t_r\), \(r\in R_0\), where either \(P'\subset P\) or \(t'_r>t_r\) for one or more r, \(r\in R_0\), without losing possible solutions. This state strengthening is applied in Algorithm 1 at line 13 to any state that is obtained from the transition function \(\tau \).

By considering the earliest start time \(\mathrm {s}((P,t),j)\) for job j, \(j\in P\), we can first remove all jobs from P that actually cannot be scheduled anymore, i.e., \(P'=\{j\in P\mid \mathrm {s}((P,t),j)\ne T^\mathrm {max}\}\). Then, time \(t'_r,\ r \in R_0\), is set to the earliest possible time when resource r can be used considering all remaining jobs \(P'\), i.e.,

$$\begin{aligned}&t'_0= \min _{j\in P'}\ (\mathrm {s}((P,t),j) + p^{\mathrm {pre}}_j),\end{aligned}$$
(8)
$$\begin{aligned}&t'_r= {\left\{ \begin{array}{ll} \min _{j\in J_r \cap P'}\ \mathrm {s}((P,t),j), &{} \text{ if } J_r \cap P' \ne \emptyset , \\ T^\mathrm {max}, &{} \text{ else, } \end{array}\right. }&r\in R. \end{aligned}$$
(9)

here \(t'_r\) is set to \(T^\mathrm {max}\) if no job that requires resource r remains in \(P'\).

4.4 Upper bounds for the total prize of remaining jobs

For a given state (Pt), an upper bound for the still achievable total prize for the remaining jobs in P can be calculated by solving a linear programming (LP) relaxation of a multi-constrained 0–1 knapsack problem

$$\begin{aligned} {Z^\mathrm {ub}_{\text {MKP-LP}}(P,t) = \hbox {max} \quad }&\sum _{j\in P} z_j x_j&\end{aligned}$$
(10)
$$\begin{aligned} {\hbox {s.t}\quad }&\sum _{j\in P} p^0_j x_j \le W_{0}(P,t),&\end{aligned}$$
(11)
$$\begin{aligned}&\sum _{j\in P \cap J_r} p_j x_j \le W_{r}(P,t),&r \in R, \end{aligned}$$
(12)
$$\begin{aligned}&x_j\in \left[ 0,1 \right] ,&j\in P, \end{aligned}$$
(13)

where \(x_j\) is a continuous relaxation of a binary variable that indicates if job j is included (=1) or not (=0), \(j \in P\). The right-hand-sides of the knapsack constraints are

$$\begin{aligned} W_{0}(P,t)=\left| \bigcup _{\begin{array}{c} j \in P,\\ w=1,\ldots ,\omega _j\mid \\ W^\mathrm {end}_{jw}-p^{\mathrm {post}}_j\ge t_0+p^0_j \end{array}} \left[ \max \left( t_0,W^\mathrm {start}_{jw}+p^{\mathrm {pre}}_j\right) ,W^\mathrm {end}_{jw}-p^{\mathrm {post}}_j\right] \right| \end{aligned}$$
(14)

and

$$\begin{aligned} W_{r}(P,t)=\left| \bigcup _{\begin{array}{c} j\in P\cap J_r,\\ w=1,\ldots ,\omega _j\mid \\ W^\mathrm {end}_{jw}\ge t_r+p_j \end{array}} \left[ \max \left( t_r,W^\mathrm {start}_{jw} \right) ,W^\mathrm {end}_{jw}\right] \right| , \end{aligned}$$
(15)

where the union of intervals is defined as \(\bigcup _{i=1,\ldots ,k}[\alpha _i,\beta _i] = \{\gamma \in {\mathbb {R}} \mid \exists i: \gamma \in [\alpha _i,\beta _i]\}\), and function \(|\cdot |\) denotes the sum of the lengths of the resulting disjoint continuous intervals of this union. Thus, \(W_{0}(P,t)\) and \(W_{r}(P,t)\) represent the total amount of still available time for resource 0 and resource r, \(r \in R\), respectively, considering the current state and the time windows.

To solve this upper bound calculation problem for each state with an LP solver is computationally rather expensive, as our experiments in the next section will document. As alternatives, we therefore consider the computation of upper bounds by solving two types of further relaxations. The first one

$$\begin{aligned} {Z^\mathrm {ub}_{0}(P,t) = \hbox {max} \quad }&\sum _{j\in P} z_j x_j&\end{aligned}$$
(16)
$$\begin{aligned} {\hbox {s.t} \quad }&\sum _{j\in P} p^0_j x_j \le W_{0}(P,t),&\end{aligned}$$
(17)
$$\begin{aligned}&x_j\in \left[ 0,1 \right] ,&j\in P, \end{aligned}$$
(18)

is obtained by removing inequalities (12); the second one

$$\begin{aligned} {h^{\text {ub}}(P,t,u) = \hbox {max} \quad }&\sum _{j\in P} z_j x_j + u \left( W_{0}(P,t)-\sum _{j\in P} p^0_j x_j \right)&\end{aligned}$$
(19)
$$\begin{aligned} {\hbox {s.t} \quad }&\sum _{j\in P \cap J_r} p_j x_j \le W_{r}(P,t),&r \in R, \end{aligned}$$
(20)
$$\begin{aligned}&x_j\in \left[ 0,1\right] ,&j\in P, \end{aligned}$$
(21)

is obtained by performing a Lagrangian relaxation of inequality (11), where \(u \ge 0\) is the Lagrangian dual multiplier associated with inequality (11).

Both \(Z^\mathrm {ub}_{0}(P,t)\) and \(h^{\text {ub}}(P,t,u)\) are computed by solving LP relaxations of knapsack problems. In the latter case, this is possible since the problem separates over the resources and for each resource, the resulting problem is an LP relaxation of a knapsack problem. An LP relaxation of a knapsack problem can be efficiently solved by a greedy algorithm that packs items in decreasing prize/time-ratio order; the first item that does not completely fit is packed partially so that the capacity is exploited as far as possible, see Kellerer et al. (2004).

It follows from weak duality (see e.g. Nemhauser and Wolsey (1988), Prop. 6.1) that \(h^{\text {ub}}(P,t,u)\) yields an upper bound on \(Z^\mathrm {ub}_{\text {MKP-LP}}(P,t)\) for all \(u \ge 0\), but the quality of this upper bound depends on the choice of u. We have chosen to consider \(h^{\text {ub}}(P,t,u)\) for the values \(u=0\) and \(u=z_{{\bar{j}}}/p^0_{{\bar{j}}}\), where \({\bar{j}}\) is the last, and typically partially, packed item in an optimal solution to the problem solved to obtain \(Z^\mathrm {ub}_{0}(P,t)\). The value \(u=z_{{\bar{j}}}/p^0_{{\bar{j}}}\) is chosen since it is an optimal LP dual solution associated with inequality (17) and therefore has a chance to be a good estimate of a value for u that gives a strong upper bound.

By solving the relaxations introduced above, the strongest bound on \(Z^\mathrm {ub}_{\text {MKP-LP}}(P,t)\) we can obtain is

$$\begin{aligned} Z^\mathrm {ub}_{*}(P,t) = \min \left( Z^\mathrm {ub}_{\text {0}}(P,t),h^{\text {ub}}(P,t,0),h^{\text {ub}}(P,t,z_{{\bar{j}}}/p^0_{{\bar{j}}})\right) . \end{aligned}$$
(22)

In our experimental comparisons in Sect. 7, we will illustrate the practical strengths of the bounds

$$\begin{aligned}&Z^\mathrm {ub}_{\text {MKP-LP}}(P,t), \end{aligned}$$
(23)
$$\begin{aligned}&Z^\mathrm {ub}_{\text {0}}(P,t), \end{aligned}$$
(24)
$$\begin{aligned}&Z^\mathrm {ub}_{00}(P,t) = \min \left( Z^\mathrm {ub}_{\text {0}}(P,t),h^{\text {ub}}(P,t,0)\right) , \text{ and } \end{aligned}$$
(25)
$$\begin{aligned}&Z^\mathrm {ub}_{0{{\bar{j}}}}(P,t) = \min \left( Z^\mathrm {ub}_{\text {0}}(P,t),h^{\text {ub}}(P,t,z_{{\bar{j}}}/p^0_{{\bar{j}}}) \right) , \end{aligned}$$
(26)

and study which compromise of strength and computational effort pays off the most in the context of our A* search.

5 A mixed integer programming model

We use the binary variable \(t_{j}\) to indicate if job j, \(j \in J\), is included in the schedule (=1) or not (=0) and the binary variable \(t_{jw}\) to indicate if job j is assigned to time window w (=1) or not (=0), \(w=1,\ldots ,\omega _j\), \(j\in J\). Let the continuous variable \(s_{j}\) be the start time of job j. Binary variable \(y_{jj'}\) is further used to indicate if job j is scheduled before \(j'\) w.r.t. the common resource (=1) or not (=0), if both jobs are scheduled, \(j,j' \in J,\ j\not =j'\).

Let

$$\begin{aligned} \delta _{jj'}&= {\left\{ \begin{array}{ll} p_j, &{}\quad \text{ if } q_{j}=q_{j'}, \\ p^{\mathrm {pre}}_j+p^0_j-p^{\mathrm {pre}}_{j'}, &{}\quad \text{ if } q_{j}\ne q_{j'}, \end{array}\right. } \end{aligned}$$
(27)

be the minimum time between the start of job j and the start of job \(j'\) if job j is scheduled before job \(j'\), which depends on whether both jobs use the same resource or not.

A solution to PC-JSOCMSR can be obtained by solving the MIP model

$$\begin{aligned} {\hbox {max} \quad }&\sum _{j \in J} z_jt_j&\end{aligned}$$
(28)
$$\begin{aligned} {\hbox {s.t} \quad }&t_{j} = \sum _{w=1,\ldots ,\omega _j}t_{jw},&j \in J, \end{aligned}$$
(29)
$$\begin{aligned}&y_{jj'} + y_{j'j}\ge t_{j}+t_{j'} -1,& j,j' \in J,~j \ne j',&\end{aligned}$$
(30)
$$\begin{aligned}&s_{j'} \ge s_{j} + \delta _{jj'} - (T^\mathrm {dead}_{j} -p_j-T^\mathrm {rel}_{j'}+\delta _{jj'})(1-y_{jj'}),&\nonumber \\&j,j' \in J,\ j\not = j' \end{aligned}$$
(31)
$$\begin{aligned}&s_j \ge T^\mathrm {rel}_{j} + \sum _{w=1,\ldots ,\omega _j}\left( W^\mathrm {start}_{jw}-T^\mathrm {rel}_{j}\right) t_{jw},&j\in J, \end{aligned}$$
(32)
$$\begin{aligned}&s_j \le T^\mathrm {dead}_{j}-p_j + \sum _{w=1,\ldots ,\omega _j}(W^\mathrm {end}_{jw}-T^\mathrm {dead}_{j})t_{jw},&j \in J, \end{aligned}$$
(33)
$$\begin{aligned}&t_{j} \in \{0,1\},&j \in J, \end{aligned}$$
(34)
$$\begin{aligned}&t_{jw} \in \{0,1\},&w=1,\ldots ,\omega _j, ~ j \in J, \end{aligned}$$
(35)
$$\begin{aligned}&s_j \in [T^\mathrm {rel}_j,T^\mathrm {dead}_j-p_j],&j \in J, \end{aligned}$$
(36)
$$\begin{aligned}&y_{jj'} \in \{0,1\},&j,j' \in J,\ j \ne j'. \end{aligned}$$
(37)

Equations (29) state that each scheduled job must be assigned to a time window and inequalities (30) ensure that if two jobs j and \(j'\) are scheduled, either \(y_{jj'}\) or \(y_{j'j}\) must be set to one, i.e., one of them needs to precede the other. If a job is to precede another one, inequalities (31) ensure this w.r.t. the jobs’ start times. If a job is assigned to a time window, inequalities (32) and (33) make its start time comply with this time window, and otherwise the job only complies with its release time and deadline.

In the previous work (Horn et al. 2017), a MIP model with position based variables was introduced for JSOCMSR since this model showed better computational performance than a MIP model with order based variables. Such position based model does, however, not extend well to the current setting with multiple time windows since the time windows require explicit knowledge of the start time of each job, and the position based model only has explicit times for the start time of a certain position.

6 A constraint programming model

We further propose the following Constraint Programming (CP) model for the PC-JSOCMSR, which we implemented in the constraint modeling language MiniZinc.Footnote 1 The model makes use of so-called option type variables. Such a variable may either have a value of a certain domain assigned or set to the special value \(\top \) that indicates the absence of a value. For job \(j \in J\) we use the option type variable \(s_j\) for the job’s start time. An absent start time, i.e., \(s_j=\top \), indicates that the job is not scheduled. The CP model is given by

$$\begin{aligned} \max&\sum _{j \in J \mid \mathrm{occurs}(s_j)} z_j&\end{aligned}$$
(38)
$$\begin{aligned}&\mathrm{disjunctive\_strict}(\{(s_j + p^{\mathrm {pre}}_j, p^0_j) \mid j \in J\}),&\end{aligned}$$
(39)
$$\begin{aligned}&\mathrm{disjunctive\_strict}(\{(s_j, p_j) \mid j \in J \wedge q_j = r\}),&r \in R, \end{aligned}$$
(40)
$$\begin{aligned}&\mathrm{occurs}(s_j) \rightarrow \min _{\omega =1,\dots ,\omega _j} W^\mathrm {start}_{j\omega } \le s_j \le \max _{\omega =1,\dots ,\omega _j} (W^\mathrm {end}_{j\omega }-p_j),&j \in J, \end{aligned}$$
(41)
$$\begin{aligned}&s_j \in [T^\mathrm {rel}_j,\dots ,T^\mathrm {dead}_j-p_j] \cup \{ \top \},&j \in J, \end{aligned}$$
(42)

where for job \(j \in J\) the predicate \(\mathrm{occurs}(s_j)\) yields true if the option type variable \(s_j\) is not absent, i.e., job j is scheduled. The strict disjunctive constraints (39) and (40) ensure that all scheduled jobs do not overlap w.r.t. their usage of the common resource and the secondary resource \(r \in R\), respectively. Absent jobs are hereby ignored. Constraints (41) state that if job \(j \in J\) is scheduled, it must be performed within one of the job’s time windows.

7 Experimental results

The proposed A* algorithm from Sect. 4 was implemented in C++ using GNU G++ 5.4.1 for compilation. The MIP model from Sect. 5 was solved with GurobiFootnote 2 Optimizer Version 7.5.1. All tests were performed on a cluster of machines with Intel Xeon E5-2640 v4 processors with 2.40 GHz in single-threaded mode with a CPU time limit of 900 seconds and a memory limit of 15GB per run.

We created three non-trivial benchmark instance sets in order to test our solution approaches. The first two instance sets, called B and S, are, with respect to their basic characteristics, inspired from the particle therapy patient scheduling application, while the third set, called A, exhibits characteristics from the avionic system application. All these instances are available online.Footnote 3 Each set consists of 30 instances for each combination of \(n \in \{10,20,30,40,50,60,70,80,90\}\) jobs and \(m \in \{2,3\}\) secondary resources for the particle therapy based scenario and \(m \in \{3,4\}\) secondary resources for the avionic system based scenario.

The difference between sets B and S lies in the distributions of the secondary resources and each job’s pre-processing, post-processing and \(p^0\) times. Set B consists of instances in which all secondary resources are used by the jobs equally likely and the distribution of processing times is balanced. This was achieved by sampling for job j, \(j \in J\): (1) the secondary resource \(q_j\) from the discrete uniform distribution \({\mathcal {U}}\{1,m\}\), (2) the pre-processing times \(p^{\mathrm {pre}}_j\) and the post-processing times \(p^{\mathrm {post}}_j\) from \({\mathcal {U}}\{0,8\}\) and (3) the times \(p^0_j\) from the random variable \(\mathrm {p}^0_\mathrm {B}\sim {\mathcal {U}}\{1,8\}\). In contrast, instances of set S exhibit a skewed workload such that the secondary resource m is chosen with a probability of 0.5 and all remaining secondary resources with a probability of \(1/(2m-2)\). Furthermore the time to claim the common resource 0 is more dominant by sampling \(p^{\mathrm {pre}}_j\) and \(p^{\mathrm {post}}_j\) from \({\mathcal {U}}\{0,5\}\) and \(p^0_j\) from the random variable \(\mathrm {p}^0_\mathrm {S}\sim {\mathcal {U}}\{1,13\}\). For both instance sets, the prize \(z_j\) is determined in a way to correlate to the usage of the common resource of job j by sampling it from \({\mathcal {U}}\{p^0_j,2p^0_j\}\), \(j \in J\). The time windows are chosen such that, on average about, \(30\%\) of the jobs can be scheduled. For this purpose let \(T_i = \lfloor 0.3 \, n \, \mathrm {E}(\mathrm {p}^0_i) \rfloor \) be the expected maximum resource usage regarding instance type i, \(i \in \{\mathrm {B},\mathrm {S}\}\). First, the number of time windows \(\omega _j\) of job j is sampled from \({\mathcal {U}}\{1,3\}\), i.e., a job can have up to three time windows. Second, for time window w, \(w=1,\dots ,\omega _j\), we sample its start time \(W^\mathrm {start}_{jw}\) from \({\mathcal {U}}\{0, T_i-p_j\}\) and its end time \(W^\mathrm {end}_{jw}\) from \(W^\mathrm {start}_{jw} + \max (p_j, {\mathcal {U}}\{\lfloor 0.1 \, T_i/\omega _j\rfloor ,\lfloor 0.4 \, T_i/\omega _j\rfloor \})\) for instance type i, \(i \in \{\mathrm {B},\mathrm {S}\}\). Overlapping time windows are merged, and all time windows of a job are sorted according to increasing start times.

For the third instance set A, based on the avionic system scenario, a fixed time horizon \(T = 1000\) is considered and the number of jobs of each type is distributed such that \(20\%\) are communication jobs, \(40\%\) are partition jobs and \(40\%\) are regular jobs. For communication jobs the time \(p^0_j\) is set to \(p^0_j=40\) and for partition jobs and regular jobs the time is sampled from \({\mathcal {U}}\{36,44\}\). For partition jobs, the total processing time \(p_j\) is sampled from \({\mathcal {U}}\{5 p^0_j,8 p^0_j\}\) and then, with equal probability, \(p^{\mathrm {pre}}_j\) or \(p^{\mathrm {post}}_j\) is set to 0 and the respective other value is set to \(p_j - p^0_j\). Each partition job is assigned to a secondary resource and each secondary resource has the same probability to be selected.

Since the communication jobs and regular jobs do not use a secondary resource in the real scenario, an artificial secondary resource is introduced and assigned to all of these jobs. This means that the number of secondary resources for an instance is always one more than the number of application modules in the system and that for both the communication jobs and regular jobs \(p_j = p^0_j\).

For partition and regular jobs, the number of time windows and the length of the time windows are computed as in the particle therapy case, but for the communication jobs the structure is rather different. The communication jobs can only be scheduled at certain points in time when the communication can be performed, and here these time points occur at \(0, 80, 160, \dots , 880\). Each time window of a communication job corresponds to one such time point and a job’s total set of time windows corresponds to a number of consecutive such time points. The number of time windows for a communication job is obtained by sampling a value from the uniform distribution \({\mathcal {U}}\{1,3\}\) and multiply it by three.

The prize \(z_j\) is for five of the partition jobs and ten of the communication jobs set to the high value 70 to give these jobs a higher priority, while for remaining partition jobs and communication jobs, the prize is sampled from \({\mathcal {U}}\{10,50\}\). For regular jobs, the prize is sampled from \({\mathcal {U}}\{10,25\}\).

Note that since we only use integral time windows and processing times, all start times can also safely be assumed to be integral. In our implementation, we therefore round down any fractional upper bound to the closest integer value. Note that due to this rounding it is even possible that \(Z^\mathrm {ub}_{*}(P,t)\) yields occasionally tighter bounds than \(Z^\mathrm {ub}_{\mathrm {MKP-LP}}(P,t)\).

7.1 Comparison of upper bound functions

We start by experimentally evaluating the impact of the individual components of our combined upper bound function \(Z^\mathrm {ub}_{*}(P,t)=\min (Z^\mathrm {ub}_{\text {0}}(P,t),h^{\text {ub}}(P,t,0),h^{\text {ub}}(P,t,z_{{\bar{j}}}/p^0_{{\bar{j}}}))\) from Sect. 4.1. To this end, we performed the A* search on all benchmark instances using \(Z^\mathrm {ub}_{*}(P,t)\) to evaluate all states and count for the sub-functions \(Z^\mathrm {ub}_{\text {0}}(P,t)\), \(h^{\text {ub}}(P,t,0)\), and \(h^{\text {ub}}(P,t,z_{{\bar{j}}}/p^0_{{\bar{j}}})\) how often each one of them yields the minimum, i.e., determines \(Z^\mathrm {ub}_{*}(P,t)\). Figure 3 shows the obtained average success rates grouped according to the instance type, the number of jobs n, and the number of secondary resources m for all three upper bounds.

Fig. 3
figure 3

Success rates of upper bound subfunctions \(Z^\mathrm {ub}_{\text {0}}(P,t)\), \(h^{\text {ub}}(P,t,0)\) and \(h^{\text {ub}}(P,t,z_{{\bar{j}}}/p^0_{{\bar{j}}})\) to yield the smallest value, i.e., to determine \(Z^\mathrm {ub}_{*}(P,t)\)

Most importantly we can see that in most cases not a single sub-function is dominating, i.e., it makes sense to calculate all three functions and to combine their results by taking the minimum in order to get a generally tighter bound. More specifically, the success of each sub-function obviously also depends on the specific characteristics of the problem instances. For instances of type B with two secondary resources, \(h^{\text {ub}}(P,t,0)\) is for each instance class on average more than \(50\%\) of the times the strongest upper bound. In all other cases \(h^{\text {ub}}(P,t,z_{{\bar{j}}}/p^0_{{\bar{j}}})\) is on average most successful, and for the instances of type S it is clear that both the bounds \(h^{\text {ub}}(P,t,z_{{\bar{j}}}/p^0_{{\bar{j}}})\) and \(Z^\mathrm {ub}_{\text {0}}(P,t)\) are of importance.

The strongest upper bound function, however, does not necessarily yield the best performing A* search, since the time for calculating the bound also plays a major role. As already stated in Sect. 4.1, we consider the upper bound functions \(Z^\mathrm {ub}_{\mathrm {MKP-LP}}(P,t)\), \(Z^\mathrm {ub}_{\text {0}}(P,t)\), \(Z^\mathrm {ub}_{00}(P,t)\), \(Z^\mathrm {ub}_{0{{\bar{j}}}}(P,t)\), and \(Z^\mathrm {ub}_{*}(P,t)\). The former is solved by the CPLEX 12.7 LP solver in single threaded mode whereas the other upper bound functions make use of the sub-functions \(h^{\text {ub}}(P,t,u)\), \(u\ge 0\) and \(Z^\mathrm {ub}_{\text {0}}(P,t)\) in different ways as stated in Eqs. (22)–(26).

Table 1 presents the aggregated results for each combination of instance type, numbers of jobs, and secondary resources for our A* search using these different upper bound calculations. Columns \(\%\)-opt show the percentage of instances which could be solved to proven optimality. Columns \(\overline{Z^\mathrm {ub}}\) state the average final upper bounds and columns \(\overline{\%\text {-gap}}\) list the average optimality gaps which are calculated by \(100\%\cdot (Z^\mathrm {ub}-Z(\pi ))/Z^\mathrm {ub}\), where \(\pi \) is the final solution and \(Z^\mathrm {ub}\) the final upper bound. Columns t[s] list the median computation times in seconds, whereas columns \(\overline{|W|}\) state the average number of encountered states during the A* search. Best values are printed bold.

Table 1 Average results of A* search for different upper bound functions

In almost all cases, A* search with the combined bound \(Z^\mathrm {ub}_{*}(P,t)\) provides the tightest final bounds. There are only five exceptions where \(Z^\mathrm {ub}_{00}(P,t)\) or \(Z^\mathrm {ub}_{0 {{\bar{j}}}}(P,t)\) yield tighter bounds on average, but \(Z^\mathrm {ub}_{*}(P,t)\) is not far behind. Using the original LP relaxation \(Z^\mathrm {ub}_{\mathrm {MKP-LP}}(P,t)\) yields in almost all cases where not all instances could be solved to optimality worse final upper bounds than using \(Z^\mathrm {ub}_{\text {0}}(P,t)\), \(Z^\mathrm {ub}_{00}(P,t)\), \(Z^\mathrm {ub}_{0 \bar{j}}(P,t)\) or \(Z^\mathrm {ub}_{*}(P,t)\). The reason for this is that, although the full LP relaxation \(Z^\mathrm {ub}_{\mathrm {MKP-LP}}(P,t)\) may provide the tightest upper bound for a single state, substantially less nodes could be processed due to the higher computational effort to solve each LP, cf. columns |W|.

When considering instance sets B and S, in cases where not all instances could be solved to optimality, the A* search with \(Z^\mathrm {ub}_{\text {0}}(P,t)\) was able to provide the smallest average optimality gaps in most cases. For instances of set A the smallest average optimality gaps could be obtained from the A* search with \(Z^\mathrm {ub}_{0 {{\bar{j}}}}(P,t)\) or \(Z^\mathrm {ub}_{*}(P,t)\) and for instances of set S with two secondary resources the smallest average gaps could be obtained from the A* algorithm with \(Z^\mathrm {ub}_{00}(P,t)\) or \(Z^\mathrm {ub}_{*}(P,t)\). This observation is in accordance with our previous observation concerning Fig. 3, where \(h^{\text {ub}}(P,t,0)\) provides more often the strongest upper bound for instances of type S with two secondary resources and where \(Z^\mathrm {ub}_{0 {{\bar{j}}}}(P,t)\) provides frequently more often the strongest upper bounds for instances of type A. We conclude that \(Z^\mathrm {ub}_{\text {0}}(P,t)\) might be a slightly better guidance for instances in sets B and S for our simple greedy heuristic used to find solutions when terminating early.

Considering only instance classes where all instances could be solved to optimality, the A* algorithm with upper bound function \(Z^\mathrm {ub}_{\mathrm {MKP-LP}}(P,t)\) encounters less states than A* with \(Z^\mathrm {ub}_{*}(P,t)\) which in turn encounters less states than A* with one of the other functions \(Z^\mathrm {ub}_{0}(P,t)\), \(Z^\mathrm {ub}_{00}(P,t)\), or \(Z^\mathrm {ub}_{0 {{\bar{j}}}}(P,t)\), respectively. This is not surprising since function \(Z^\mathrm {ub}_{\mathrm {MKP-LP}}(P,t)\) solves the full LP relaxation which provides frequently the strongest upper bounds and \(Z^\mathrm {ub}_{*}(P,t)\) dominates the other functions \(Z^\mathrm {ub}_{0}(P,t)\), \(Z^\mathrm {ub}_{00}(P,t)\), and \(Z^\mathrm {ub}_{0 {{\bar{j}}}}(P,t)\). However, again we see that providing the strongest upper bounds cannot outweigh the disadvantage of the longer computation times such that A* with \(Z^\mathrm {ub}_{*}(P,t)\) terminates in almost all cases substantially earlier than A* with \(Z^\mathrm {ub}_{\mathrm {MKP-LP}}(P,t)\).

Last but not least, we point out that the memory limit of 16GB was the termination reason in several runs for the largest instances. Thus, memory consumption plays a significant role in our A* algorithm. One way to save memory would be to adopt the technique applied in (Horn et al. 2017) where states with the same P are stored in an aggregated fashion. This can be done by storing P only once and include the individual vectors t and further information in an attached list of so-called non-dominated time records.

7.2 Comparison of A* search, MIP, and CP

We finally compare our A* search using the generally dominating upper bound function \(Z^\mathrm {ub}_{*}(P,t)\) to solving the MIP model from Sect. 5 using Gurobi and the CP model from Sect. 6 using MiniZinc 2.1.7 with the backend solver Chuffed. Note that we considered besides Chuffed also the backend solvers Gecode and G12 LazyFD, but Chuffed clearly dominated these alternatives concerning the number of instances solved to proven optimality, as it is documented in more detail in our conference paper Horn et al. (2018b). Table 2 shows the aggregated results. Regarding the number of instances that could be solved to proven optimality, the A* search consistently outperforms Gurobi and Chuffed. For particle therapy based instances of type B and S, A* search could solve all instances with up to 50 jobs to proven optimality, except one skewed instance with three secondary resources. The avionic system based instances of type A are harder to solve. Here, A* search was only able to solve all instances with up to 30 jobs to proven optimality.

The largest instance which A* could solve to proven optimality consists of 80 jobs, whereas the largest instances that Gurobi and Chuffed could solve to proven optimality have 50 and 60 jobs, respectively. Gurobi could solve all avionic based instances with up to 20 jobs, all balanced instances with up to 40 jobs, and all skewed instances with up to 30 jobs to proven optimality. Computation times for those are, however, significantly larger than for A*. In particular for small instances with up to \(n=30\) jobs, A* only required median computation times of no more than 0.1 seconds for instances of type B and S. The CP solver Chuffed could solve all instances of type B and S up to \(n=40\) jobs to optimality and all instances of type A up to \(n=30\) jobs to optimality. The A* algorithm was able to provide equally good or better final solutions than Gurobi and Chuffed in almost all cases for instances of type B and S. Exceptions occurred only for some of the largest instances with 90 jobs, where Gurobi’s heuristic performance proved to be superior.

For instances of type A, Gurobi’s heuristic performance is also superior for instances of smaller sizes. Note, however, that the derivation of just heuristic solutions is not in the foreground of our research here. Concerning obtained upper bounds, the A* search again clearly outperforms the MIP approach by a large margin, especially on the largest instances. Chuffed is not able to return any upper bounds.

Table 2 Average results of A*, MIP, and CP

8 Conclusions and future work

We introduced the PC-JSOCMSR as a practically relevant extended variant of the formerly considered JSOCMSR (Horn et al. 2017). The essential differences are that now the considered jobs have prizes, not all jobs can, in general, be scheduled due the time windows, and that we therefore have to select a subset of the jobs to schedule. Instead of the makespan, we maximize the total prize of the scheduled jobs. This changes in the problem statement substantially affect the structure of the problem and make it in practice much more challenging to solve.

For all our experiments we considered benchmark instances inspired from two practically relevant application scenarios: daily particle therapy scheduling of cancer treatments and the pre-runtime scheduling of avionic systems. We showed that small to medium sized instances of up to about 40 jobs can almost consistently be solved to proven optimality by our A* search. The state graph plays a fundamental role in achieving the reported computational efficiency. Our state graph follows a natural node representation and features a state strengthening, which reduces the number of nodes that must in general be considered. A particularity of our approach is that we only add states that are known to have successor states to the open list and we do not have a single dedicated target node. This also reduces the size of the open list.

Most crucial for the performance of the A* search clearly is the choice of the heuristic function, which in our case is an upper bound for the total prize that may still be achieved from a state. To this end, we considered a relaxation of the PC-JSOCMSR that corresponds to an LP relaxation of a multidimensional knapsack problem. Solving this relaxation for each node under consideration turned out not to be the most effective choice. Instead, further simplifications based on constraint and Lagrangian relaxation, respectively, yield fast-to-calculate upper bound functions \(Z^\mathrm {ub}_0(P,t)\), \(Z^\mathrm {ub}_{00}(P,t)\), \(Z^\mathrm {ub}_{0{{\bar{j}}}}(P,t)\), and \(Z^\mathrm {ub}_*(P,t)\). While \(Z^\mathrm {ub}_*(P,t)\) is the strongest of these four, \(Z^\mathrm {ub}_0(P,t)\) is fastest to compute. In our experiments in the context of the A* search, using \(Z^\mathrm {ub}_*(P,t)\) showed to pay off by usually being the best choice. It yields proven optimal solutions more frequently within the allowed time and memory limits and requires less states to be considered. Not requiring a state-of-the-art LP solver might also be a further advantage in some commercial applications.

We further compared the proposed A* search to an order-based MIP model as well as a MiniZinc CP formulation that was solved with Chuffed. These other approaches, however, are clearly inferior concerning the computation times for proven optimal solutions or obtained upper bounds (actually, the CP approaches are not able to provide any upper bounds in case of early termination). An explanation for the better performance of the A* search besides the well working heuristic function seems to be that it can effectively exploit dynamic programming aspects: Frequently, a state can be reached via multiple different job sequences, and the corresponding subproblem is then only solved once.

In cases where the limits have been reached, slightly better final upper bounds could frequently be achieved by the A* search than from the CP and MIP approaches. However, we remark that our A* search should not be considered for just heuristically solving significantly larger PC-JSOCMSR instances, where it is obvious from the beginning that the search must be terminated much earlier than optimality can be proven: Complete solutions (in the sense that no further jobs can be scheduled) are usually only found very late during the whole search. The approach that we use, that is, to augment the most promising partial solution in a greedy way, was just implemented to ensure that we always return at least one complete solution.

Classical A* search is targeted towards exact solving, which was our focus here. To address substantially larger instances heuristically, other methods like metaheuristics are more appropriate, see Sect. 3. An interesting research direction is also anytime A* search variants, which trade longer runtimes to achieve proven optimality for the benefit of producing promising complete solutions already early.

If our A* terminates before a proven optimal solution could be found then this frequently happens due to exceeding the memory limit. In further research, it may thus be worth to think about even stronger upper bounds for the still achievable prizes of states to reduce the number of state expansions and consequently also the memory usage. A possible way to achieve this could be to use relaxed decision diagrams, which represent discrete relaxations of combinatorial optimization problems and can be seen as alternative to LP-based relaxations.