1 Introduction

Job scheduling is a fundamental task in optimization, with applications ranging from resource management in computing [1, 2] to operating transportation systems [3]. Given a collection of machines and a set of jobs (or tasks) to be processed, the goal of job scheduling is to assign those jobs to the machines while respecting certain constraints. The constraints placed on jobs can vary significantly. In some cases, a job has to be scheduled, but the starting time of its processing is not pre-specified. In other scenarios, a job can only be scheduled at a given time, but there is flexibility on whether to process the job or not. Frequent objectives for this task include maximizing the number of scheduled jobs or minimizing the time needed to process all given jobs.

An important variant of job scheduling is the task of interval scheduling: here, each job has a specified starting time and length, but a job is not required to be scheduled. Given M machines, the goal is to schedule as many jobs as possible. More generally, each job is also assigned a reward or weight, which can be thought of as a payment received for processing the given job. If a job is not processed, the payment is zero, i.e., there is no penalty. We refer to this variant as weighted interval scheduling. This problem, in a natural way, captures real-life scenarios. For instance, consider assigning crew members to flights where we aim to assign crews for as many flights as possible. In the context of interval scheduling, flights can be seen as jobs and the crew members as machines [3, 4]. Interval scheduling also has applications in geometrical tasks – it can be seen as a task of finding a collection of non-overlapping geometric objects. In this context, its prominent applications are in VLSI design [5] and map labeling [6, 7].

The aforementioned scenarios are executed in different computational settings. For instance, some use cases are dynamic in nature, e.g., a flight gets canceled. Then, in certain cases, we have to make online decisions, e.g., a customer must know immediately whether we are able to accept their request or not. In some applications, there might be so many requests that we would like to design extremely fast ways of deciding whether a given request/job can be scheduled, e.g., providing an immediate response to a user submitting a job for execution in a cloud.

1.1 The Computation Model

In our work, we focus on the dynamic setting of computation. For the fully dynamic setting, we design data structures that maintain an approximately optimal solution to an instance of the interval scheduling problem while supporting insertions and deletions of jobs/intervals. The data structures also support querying the maintained solution’s total weight and whether or not a particular interval is used in the maintained solution.

1.2 Our Results

Our first result, given in Sect. 4, focuses on designing an efficient dynamic algorithm for unweighted interval scheduling on a single machine. Prior to our work, the state-of-the-art result for this unweighted interval scheduling problem was due to [8], who design an algorithm with \(O(\nicefrac {\log {n}}{\varepsilon ^2})\) update and query time. We provide an improvement in the dependence on \(\varepsilon \).

Theorem 1.1

(Unweighted dynamic, single machine) Let \({\mathcal {J}}\) be a set of n jobs. For any \(\varepsilon > 0\), there exists a fully dynamic algorithm for \((1+\varepsilon )\)-approximate unweighted interval scheduling for \({\mathcal {J}}\) on a single machine performing updates in \(O\left( \frac{\log (n)}{\varepsilon } \right) \) and queries in \(O(\log (n))\) worst-case time.

Theorem 1.1 can be seen as a warm-up for our most challenging and technically involved result, which is an algorithm for the dynamic weighted interval scheduling problem on a single machine. We present our approach in detail in Sect. 5. As a function of \(1/\varepsilon \), our result is an exponential improvement compared to the running times obtained in [9]. We also remove all dependence on the job starting/ending times (previous work crucially used assumptions on the coordinates to bound the ratio of jobs’ lengths by a parameter N), and remove all dependence on the value of the job rewards.

Theorem 1.2

(Weighted dynamic, single machine) Let \({\mathcal {J}}\) be a set of n weighted jobs. For any \(\varepsilon > 0\), there exists a fully dynamic algorithm for \((1+\varepsilon )\)-approximate weighted interval scheduling for \({\mathcal {J}}\) on a single machine performing updates and queries in worst-case time \(T \in \text {poly} (\log n,\frac{1}{\varepsilon })\). The exact complexity of T is given by

$$\begin{aligned} O\left( \frac{\log ^{12}(n)}{\varepsilon ^{7}} + \frac{\log ^{13}(n)}{\varepsilon ^{6}} \right) . \end{aligned}$$

1.3 Related Work

The closest prior work to ours is that of Henzinger et al. [9], and Bhore et al. [8]. The work of Henzinger et al. studies \((1+\varepsilon )\)-approximate dynamic interval scheduling for one machine in both the weighted and unweighted setting. Unlike our main result in Theorem 1.2, they assume that: jobs have rewards within [1, W]; jobs have length at least 1; and jobs start/end within times [0, N]. They obtain deterministic algorithms with \(O(\exp (1/\varepsilon ) \log ^2{n} \cdot \log ^2{N})\) update time for the unweighted and \(O(\exp (1/\varepsilon ) \log ^2{n} \cdot \log ^5{N} \cdot \log {W})\) update time for the weighted case. They cast interval scheduling as the problem of finding a maximum independent set among a set of intervals on the x-axis. The authors extend this setting to multiple dimensions and design algorithms for approximating the maximum independent set among a set of d-dimensional hypercubes, achieving a \((1+\varepsilon ) 2^d\)-approximation in the unweighted and a \((4 + \varepsilon )2^d\)-approximation in the weighted regime.

The authors of [8] primarily focus on the unweighted case of approximating maximum independent set of a set of cubes. For the 1-dimensional case, which equals interval scheduling on one machine, they obtain \(O(\nicefrac {\log {n}}{\varepsilon ^2})\) update time, which is slower by a factor of \(1/\varepsilon \) than our approach. They also show that their approach generalizes to the d-dimensional case, requiring \(\text {poly} \log {n}\) amortized update time and providing \(O(4^d)\) approximation.

The problem of dynamically maintaining an exact solution to interval scheduling on one or multiple machines is studied by [10]. They attain a guarantee of \({\tilde{O}}(n^{1/3})\) update time for unweighted interval scheduling on \(M=1\) machine, and \({\tilde{O}}(n^{1-1/M})\) for \(M \ge 2\). Moreover, they show an almost-linear time conditional hardness lower bound for dynamically maintaining an exact solution to the weighted interval scheduling problem on even just \(M=1\) machine. This further motivates work such as ours that dynamically maintains approximate solutions for weighted interval scheduling.

The authors of [11] consider dynamic interval scheduling on multiple machines in the setting where all the jobs must be scheduled. The worst-case update time of their algorithm is \(O(\log (n)+d)\), where d refers to the depth of what they call idle intervals (depth meaning the maximal number of intervals that contain a common point); they define an idle interval to be the period in a schedule between two consecutive jobs in a given machine. The same set of authors, in [12], also study dynamic algorithms for the monotone case, in which no interval completely contains another one. For this setup, they obtain an algorithm with \(O(\log (n))\) update and query time.

In the standard model of computing (i.e., one processor, static), there exists an \(O(n+m)\) running time algorithm for (exactly) solving the unweighted interval scheduling problem on a single machine with n jobs and integer coordinates bounded by m [13]. An algorithm with running time independent of m is described in [14], where it is shown how to solve this problem on M machines in \(O(n \log (n))\) time. An algorithm is designed in [15] for weighted interval scheduling on M machines that runs in \(O(n^2 \log (n))\) time.

We refer the reader to [3] and references therein for additional applications of the interval scheduling problem.

Other related work. There has also been significant interest in job scheduling problems in which the goal is to schedule all the given jobs across multiple machines, with the objective of minimizing the total scheduling time. Several variants have been studied, including setups that allow preemptions or settings where jobs have precedence constraints. We refer the reader to [16,17,18,19,20,21,22] and references therein for more details on these and additional variants of job scheduling. Beyond dynamic algorithms for approximating maximum independent sets of intervals or hypercubes, [23] show results for geometric objects such as disks, fat polygons, and higher-dimensional analogs. In particular, after we had published a preprint of this work, [23] proved a result for dynamic data structures approximating the maximum independent set of fat objects; as discussed in their Section 6, this subsumes our Theorem 1.1.

2 Problem Setup

In the interval scheduling problem, we are given n jobs and M machines. With each job j are associated two numbers \(s_j\) and \(l_j > 0\), referring to “start” and “length” respectively, meaning that the job j takes \(l_j\) time to be processed and its processing can only start at time \(s_j\). The job then finishes at time \(f_j = s_j + l_j\). In addition, each job j has an associated weight/reward \(w_j > 0\), which is the reward received for processing job j. The task of interval scheduling is to schedule jobs across machines so as to maximize the total reward while respecting that each of the M machines can process at most one job at any point in time.
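
To pin down the objective concretely, the following is a minimal static sketch for the single-machine case (\(M = 1\)): the textbook exact dynamic program over jobs sorted by finishing time. It is included only as a baseline illustration of the objective, not as part of our algorithms; the function name and the convention that jobs touching only at an endpoint are compatible are our own illustrative choices.

```python
from bisect import bisect_right

def max_total_reward(jobs):
    """Exact weighted interval scheduling on a single machine.

    jobs: list of (s_j, l_j, w_j) triples with start s_j, length l_j > 0
    and reward w_j > 0.  Returns the maximum total reward of a set of
    pairwise non-overlapping jobs (jobs touching only at an endpoint are
    treated as compatible here).
    """
    jobs = sorted(jobs, key=lambda j: j[0] + j[1])   # sort by finish f_j = s_j + l_j
    finishes = [s + l for s, l, _ in jobs]
    best = [0.0] * (len(jobs) + 1)                   # best[i] = optimum over first i jobs
    for i, (s, l, w) in enumerate(jobs, start=1):
        p = bisect_right(finishes, s, 0, i - 1)      # last job finishing no later than s
        best[i] = max(best[i - 1],                   # option 1: skip job i
                      best[p] + w)                   # option 2: schedule job i
    return best[-1]

# Two short jobs (total reward 4) beat one long overlapping job of reward 3.
print(max_total_reward([(0, 5, 3.0), (0, 2, 2.0), (3, 2, 2.0)]))  # 4.0
```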

3 Overview of Our Techniques

Our primary goal is to present unified techniques for approximating scheduling problems that can be turned into efficient algorithms for many settings. In this section, we discuss key insights of our techniques.

In the problems our work tackles, partitioning the problem instance into independent, manageable chunks is crucial. Doing so enables an LCA to determine information about a job of interest without computing an entire schedule, or enables a dynamic data structure to maintain a solution without restarting from scratch.

3.1 Unweighted Interval Scheduling—Partitioning Over Time (Sect. 4)

For simplicity of presentation, we begin by examining our method for partitioning over time for just the unweighted interval scheduling problem on one machine (i.e., \(M=1\)). In particular, we first focus on doing so for the dynamic setting.

Recall that in this setting, the primary motivation for partitioning over time is to divide the problem into independent, manageable chunks that can be utilized by a data structure to quickly modify a solution while processing an update. In our work, we partition the time dimension by maintaining a set of borders that divide time into some contiguous regions. By doing so, we divide the problem into many independent regions, and we ignore jobs that intersect multiple regions; equivalently, we ignore jobs that contain a border. Our goal is then to dynamically maintain borders in a way such that we can quickly recompute the optimal solution completely within some region, and that the suboptimality introduced by these borders does not affect our solution much. In Sect. 4, we show that by maintaining borders where the optimal solution inside each region, i.e., a time-range between two borders, is of size \(\Theta (\frac{1}{\varepsilon })\), we can maintain a \((1+\varepsilon )\)-approximation of an optimal solution as long as we optimally compute the solution within each region.

Here, the underlying intuition is that because each region has a solution of size \(\Omega (\frac{1}{\varepsilon })\), we can charge any suboptimality caused by a border against the selected jobs in an adjacent region. Likewise, because each region’s solution has size \(O(\frac{1}{\varepsilon })\), we are able to recompute the optimal solution within some region quickly using a balanced binary search tree. We dynamically maintain borders satisfying our desired properties by adding a new border when a region becomes too large, or merging with an adjacent region when a region becomes too small. As only O(1) regions will require any modification when processing an update, this method of partitioning time, while simple, enables us to improve the fastest known update/query time to \(O(\log (n)/\varepsilon )\). In Sect. 3.2 we build on these ideas to design an algorithm for the weighted interval scheduling problem.

3.2 Weighted Interval Scheduling (Sect. 5)

In our most technically involved result, we design the first deterministic \((1+\varepsilon )\)-approximation algorithm for weighted interval scheduling that runs in \(\text {poly} (\log n,\frac{1}{\varepsilon })\) time. In this section, we give an outline of our techniques and discuss key insights. For full details, we refer the reader to Sect. 5.

3.2.1 Job Data Structure

Let \({\mathcal {E}}\) be the set of all the endpoints of given jobs, i.e., \({\mathcal {E}}\) contains \(s_i\) and \(f_i\) for each job \([s_i, f_i]\). We build a hierarchical data structure over \({\mathcal {E}}\) as follows. This structure is organized as a binary search tree T. Each node Q of T contains a value \(\textsc {key}(Q) \in {\mathcal {E}}\), with a “1-1” mapping between \({\mathcal {E}}\) and the nodes of T. Each node Q is responsible for a time range. The root of T, which we denote by \(Q_{root}\), is responsible for the entire time range \((-\infty , \infty )\). Each node Q has at most two children, which we denote by \(Q_L\) and \(Q_R\). If Q is responsible for the time range \([X, Y]\), then \(Q_L\) is responsible for \([X, \textsc {key}(Q) ]\), while \(Q_R\) is responsible for \([\textsc {key}(Q), Y]\).

Jobs are then assigned to nodes, where a job J is assigned to every node Q such that J is contained within Q’s responsible time range.

Fig. 1

Visual example for hierarchical decomposition. Suppose we are given jobs with the ranges (1, 5), (2, 10), (7, 20), (4, 5). On the left is T, a balanced binary search tree over the set of all \(s_i\) and \(f_i\). On the right is the hierarchical decomposition that corresponds to T. That is, in each row, the intervals on the right correspond to the \([l_Q, r_Q]\) for the nodes on the left. For instance, in the third row, \((-\infty , 2]\) corresponds to the node Q with \(KEY(Q) = 1\)

3.2.2 Organizing Computation (Sect. 5.1)

We now outline how the structure T is used in computation. As a reminder, our main goal is to compute a \((1+\varepsilon )\)-approximate weighted interval scheduling. This task is performed by requesting \(Q_{root}\) to solve the problem for the range \((-\infty , \infty )\). However, instead of computing the answer for the entire range \((-\infty , \infty )\) directly, \(Q_{root}\) partitions the range \((-\infty , \infty )\) into:

  • at most \(\text {poly} (n,1/\varepsilon )\) ranges over which it is relatively easy to compute approximate solutions; such ranges are called sparse, and

  • at most \(\text {poly} (n,1/\varepsilon )\) remaining ranges over which it is relatively hard to compute approximate solutions at the level of \(Q_{root}\).

These hard-to-approximate ranges are deferred to the children of \(Q_{root}\), and are hard to approximate because any near-optimal solution for the range contains many jobs. On the other hand, solutions in sparse ranges are of size \(O(1/\varepsilon )\). As discussed later, approximate optimal solutions within sparse ranges can be computed efficiently; for details, see the paragraph Approximate dynamic programming below.

In general, a child \(Q_C\) of \(Q_{root}\) might receive multiple ranges from \(Q_{root}\) for which it is asked to find an approximately optimal solution. \(Q_C\) performs computation in the same manner as \(Q_{root}\) did – the cell \(Q_C\) partitions each range it receives into “easy” and “hard” to compute subranges. \(Q_C\) computes the first type of subranges, while the second type is deferred to the children of \(Q_C\). These “hard” ranges have large weight and allow for drawing a boundary and hence dividing a range into two or more independent ranges. We now discuss how the partitioning into ranges is undertaken.

3.2.3 Auxiliary Data Structure (Sect. 5.2)

To divide a range into “easy” and “hard” ranges at the level of a node Q, we design an auxiliary data structure, which relates to a rough approximation of the problem. This structure, called Z(Q), maintains a set of points (we call these points grid endpoints) that partition Q into slices of time. We use slice to refer to a time range between two consecutive points of Z(Q). Recall how for unweighted interval scheduling, we maintained a set of borders and ignored a job that crossed any border. In the weighted version, we will instead use Z(Q) as a set of partitions from which we will use some subset to divide time. Our method of designing Z(Q) reduces the task of finding a partitioning over time Z(Q) within a cell for the \((1+\varepsilon )\)-approximate weighted interval scheduling problem to finding multiple partitionings for the \((1+\varepsilon )\)-approximate unweighted problem.

It is instructive to think of Z(Q) in the following way. First, we view weighted interval scheduling as \(O(\log n)\) independent instances of unweighted interval scheduling – instance i contains the jobs having weights in the interval \((w_{max}(Q)/2^{i+1}, w_{max}(Q)/2^{i}]\). Then, for each unweighted instance we compute borders as described in Sect. 3.1. Z(Q) constitutes a subset of the union of those borders across all unweighted instances. We point out that the actual definition of Z(Q) contains some additional points that are needed for technical reasons, but in this section we will adopt this simplified view. In particular, as we will see, Z(Q) is designed such that the optimal solution within each slice has small total reward compared to the optimal solution over the entirety of Q. This enables us to partition the main problem into subproblems such that the suboptimality of discretizing the time towards slices, that we call snapping, is negligible.
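
The following is a minimal static sketch of this simplified view, not the actual construction of Sect. 5.2.2 (which contains additional points and is maintained dynamically): jobs of a cell are split into weight classes \((w_{max}(Q)/2^{i+1}, w_{max}(Q)/2^{i}]\), and for each class, borders are placed greedily as in Sect. 3.1 so that roughly \(\nicefrac {1}{\varepsilon }\) class jobs fit between consecutive borders; the union of these borders plays the role of Z(Q). All names and constants here are illustrative.

```python
import math

def simplified_grid_endpoints(jobs, eps):
    """Simplified static view of Z(Q) for a single cell.

    jobs: non-empty list of (s, f, w) intervals inside the cell with reward w > 0.
    Returns a sorted list of candidate partition points (grid endpoints).
    """
    w_max = max(w for _, _, w in jobs)
    # Roughly log(n / eps) classes; much lighter jobs are ignored as negligible.
    num_classes = math.ceil(math.log2(len(jobs) / eps)) + 1
    borders = set()
    for i in range(num_classes):
        lo, hi = w_max / 2 ** (i + 1), w_max / 2 ** i
        # Unweighted instance i: jobs whose reward lies in (w_max/2^(i+1), w_max/2^i].
        cls = sorted((job for job in jobs if lo < job[2] <= hi), key=lambda job: job[1])
        taken, last_finish = 0, -math.inf
        for s, f, _ in cls:                       # earliest-finish-time greedy
            if s >= last_finish:
                taken += 1
                last_finish = f
                if taken >= 1 / eps:              # ~1/eps class jobs per region
                    borders.add(f)
                    taken = 0
    return sorted(borders)
```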

However, a priori, it is not even clear that such a structure Z(Q) exists. So, one of the primary goals in our analysis is to show that there exists a near-optimal solution of a desirable structure that can be captured by Z(Q). The main challenge here is to detect/localize sparse and dense ranges efficiently and in a way that yields a fast dynamic algorithm. As an oversimplification, we define a solution as having nearly-optimal sparse structure if it can be generated with roughly the following process:

  • Each cell Q receives a set of disjoint time ranges for which it is supposed to compute an approximately optimal solution using jobs assigned to Q or its descendants. Each received time range must have starting and ending time in Z(Q).

  • For each time range \({\mathcal {R}}\) that Q receives, the algorithm partitions \({\mathcal {R}}\) into disjoint time ranges of three types: sparse time ranges, time ranges to be sent to \(Q_L\) for processing, and time ranges to be sent to \(Q_R\) for processing. In particular, this means that subranges of \({\mathcal {R}}\) are deferred to the children of Q for processing.

  • For every sparse time range, Q computes an optimal solution using at most \(\nicefrac {1}{\varepsilon }\) jobs.

  • The union of the reward/solution of all sparse time ranges on all levels must be a \((1+\varepsilon )\)-approximation of the globally optimal solution without any structural requirements.

Moreover, we develop a charging method that enables us to partition each cell with only \(|Z(Q)| = \text {poly} (\nicefrac {1}{\varepsilon },\log (n))\) points and still have the property that it contains a \((1+\varepsilon )\)-approximately optimal solution with nearly-optimal sparse structure. Then, we design an approximate dynamic programming approach to efficiently compute near-optimal solutions for sparse ranges. Combined, this enables a very efficient algorithm for weighted interval scheduling. On a high-level, Z(Q) enables us to eventually decompose an entire solution into sparse regions.

3.2.4 The Charging Method (Sect. 5.2.3)

We now outline insights of our charging arguments that enable us to convert an optimal solution OPT into a near-optimal solution \(OPT'\) with nearly-optimal sparse structure while relaxing our partitioning to only need \(|Z(Q)| = \text {poly} (\nicefrac {1}{\varepsilon },\log (n))\) points. For a visual aid, see Fig. 2.

Fig. 2

Visual example for charging argument

As outlined in our overview of the nearly-optimal sparse structure, each cell Q receives a set of disjoint time ranges, with each time range having endpoints in Z(Q), and must split them into three sets: sparse time ranges, time ranges for \(Q_L\), and time ranges for \(Q_R\). We will now modify OPT by deleting some jobs. This new solution will be denoted by \(OPT'\) and will have the following properties:

  1. \(OPT'\) exhibits nearly-optimal sparse structure; and

  2. \(OPT'\) is obtained from OPT by deleting jobs of total reward at most \(O(\varepsilon \cdot w(OPT))\).

We outline an example of one such time range a cell Q may receive in Fig. 2, annotated by “received range \({\mathcal {R}}\)”. We will color jobs in Fig. 2 to illustrate aspects of our charging argument, but note that jobs do not actually have a color property beyond this illustration. Since our structure only allows a cell Q to use a job within its corresponding time range, any relatively valuable job that crosses between \(Q_L\) and \(Q_R\) must be used now by Q, which puts it in a sparse time range. One such valuable job in Fig. 2 is shown in blue and marked by “B”. To have “B” belong to a sparse range, we must divide the time range \({\mathcal {R}}\) somewhere, as otherwise our solution in the received range will be dense. If we naively divide \({\mathcal {R}}\) at the partition points of Z(Q) immediately to the left and right of the job “B”, we might be forced to delete some valuable jobs; such jobs are pictured in green and marked by “G”. Instead, we expand the division outwards in a more nuanced manner. Namely, we keep expanding outwards and looking at the job that contains the next partition point (if any). If the job’s value exceeds a certain threshold, as those pictured as green and marked by “G” in Fig. 2, we continue expanding. Otherwise, the job crossing the partition point has value below a certain threshold; such a job is pictured in brown and not marked in Fig. 2, and its deletion can be charged against the blue job. We delete such brown jobs, and the corresponding partition points, i.e., the vertical red lines crossing those brown jobs, constitute the start and the end of the sparse range. By the end of this process, we have decided the starting and ending times of the sparse range, and what remains inside are blue job(s), green job(s), and yellow job(s) (also marked by “Y”). Note that yellow jobs must be completely within a partition slice of Z(Q). Since we define Z(Q) such that the optimal total reward within any grid slice is small, the yellow jobs have relatively small rewards compared to the total reward of green and blue jobs, which we know must be large. Accordingly, we can delete the yellow jobs (to help make this time range’s solution sparse) and charge their cost against a nearby green or blue job. In Fig. 2, an arrow from one job to another represents a deleted job pointing towards the job against which we charge its loss. Finally, each sparse range contains only green job(s) and blue job(s). If there are more than \(\nicefrac {1}{\varepsilon }\) jobs in such a sparse range, we employ a simple sparsifying step detailed in the full proof.

Fig. 3

This example illustrates why the snapping we perform must be done carefully. The horizontal segments in this figure represent jobs. We show an initial dense range (outlined in purple) with endpoints in Z(Q). We show where these endpoints are in \(Q_L\) with dashed vertical lines. Importantly, they are not aligned with \(Z(Q_L)\), i.e., the vertical dashed lines do not belong to \(Z(Q_L)\). However, our structure requires that dense ranges align with \(Z(Q_{child})\), so we must address this. If we were to naively snap the endpoints of the dense range inwards to the endpoints of \(Z(Q_L)\), then we would need to delete some jobs (these deleted jobs are colored in yellow and marked by “Y”), while some other jobs would not be affected (like the remaining jobs in this example, those colored in blue). While this naive snapping may be fine in some cases, it will incur a significant loss in cases where the “Y” jobs have large weights. Notice that naively snapping outward to define a new region corresponding to the purple one is not a solution either, as this could cause the dense time range to overlap with a previously selected sparse time range. Having overlapping ranges can cause us to choose intersecting jobs and, thus, an invalid solution. Thus, we detail a more comprehensive manner of dealing with snapping

It remains to handle the time ranges of the received range that were not put in sparse ranges. These time ranges are sent to \(Q_L\) and \(Q_R\). In Fig. 2, these ranges are outlined in yellow and annotated by “child subproblem”. However, the time ranges do not necessarily align with \(Z(Q_L)\) or \(Z(Q_R)\) as is required by the nearly-optimal sparse structure. We need to adjust these ranges to align with \(Z(Q_L)\) or \(Z(Q_R)\) so we can send the ranges to the children. See Fig. 3 for intuition on why we cannot just immediately “snap” these child subproblems to the partition points in \(Z(Q_L)\) and \(Z(Q_R)\). (We say that a range \({\mathcal {R}}\) is snapped inward (outward) within cell Q if \({\mathcal {R}}\) is shrunk (extended) on both sides to the closest points in Z(Q). Inward snapping is illustrated in Fig. 3.) Instead, we employ a similar charging argument to deal with snapping. As an analog to how we expanded outwards from the blue job for defining sparse ranges, we employ a charging argument where we contract inwards from the endpoints of the child subproblem. In summary, these charging arguments enable us to show that a solution of nearly-optimal sparse structure exists even when only partitioning each cell Q with \(|Z(Q)| = \text {poly} (\nicefrac {1}{\varepsilon },\log (n))\) points.

3.2.5 Approximate Dynamic Programming (Sect. 5.3)

Now, we outline our key advance for more efficiently calculating the solution of nearly-optimal sparse structure. This structure allows us to partition time into ranges with sparse solutions. More formally, we are given a time range and we want to approximate an optimal solution within that range that uses at most \(\nicefrac {1}{\varepsilon }\) jobs. We outline an approximate dynamic programming approach that only requires polynomial time dependence on \(\nicefrac {1}{\varepsilon }\).

The relatively well-known dynamic programming approach for weighted interval scheduling maintains a table whose state is a prefix range of time and whose value is the maximum total reward that can be obtained in that range of time. However, for our purposes, there are too many possible prefix ranges of time to consider. Instead, we invert the dynamic programming approach and make the state reference some amount of reward, where the dynamic program returns the minimum-length prefix range of time in which one can obtain that reward. Unfortunately, there are also too many possible amounts of reward. We observe that we do not actually need this exact state but only an approximation. In particular, we show that one can round this state down to powers of \((1+\varepsilon ^2)\) and hence significantly reduce the state space. In Sect. 5.3, we show how one can use this type of observation to quickly compute an approximate dynamic program for a near-optimal sparse solution inside any time range.
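
Below is a minimal static sketch of this inverted, reward-rounded dynamic program for a single range, written purely for illustration (the actual procedure and its guarantees are given in Sect. 5.3): states are pairs of a reward level, rounded down to a power of \((1+\varepsilon ^2)\), and a number of used jobs, and for each state we store the smallest finishing time achieving it. All names are our own.

```python
import math

def approx_sparse_reward(jobs, K, eps):
    """Estimate the best total reward achievable with at most K jobs in a range.

    jobs: list of (s, f, w) intervals fully inside the range, with w > 0.
    State: (rounded reward level, number of used jobs) -> smallest finishing time.
    Rounding rewards down to powers of (1 + eps**2) keeps the number of levels
    small; over at most K ~ 1/eps jobs the accumulated loss is a (1 - O(eps)) factor.
    """
    base = 1.0 + eps ** 2
    earliest = {(None, 0): -math.inf}                # level None encodes "no reward yet"
    for s, f, w in sorted(jobs, key=lambda j: j[1]): # process jobs by finishing time
        updates = {}
        for (level, k), t in earliest.items():
            if k == K or s < t:                      # job budget used up, or job overlaps
                continue
            reward = (0.0 if level is None else base ** level) + w
            key = (math.floor(math.log(reward, base)), k + 1)   # round reward down
            updates[key] = min(earliest.get(key, math.inf),
                               updates.get(key, math.inf), f)
        earliest.update(updates)
    levels = [lvl for (lvl, _) in earliest if lvl is not None]
    return base ** max(levels) if levels else 0.0
```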

3.2.6 Comparison with Prior Work

The work closest to ours is that of [9]. In terms of improvements, we achieve the following: we remove the dependence on N and \(w_{\textrm{max}}\) in the running-time analysis; and we design an algorithm with \(\text {poly} (1/\varepsilon , \log n)\) update/query time, which is exponentially faster in \(1/\varepsilon \) compared to the prior work.

In this prior work, jobs are assumed to have length at least 1 and to lie within the time interval [1, N]. To remove the dependence on N and such assumptions, we designed a new way of bookkeeping jobs. Instead of using a complete binary tree on [1, N] to organize jobs as done in the prior work, we construct a balanced binary search tree on the endpoints of jobs. A complete binary tree on [1, N] is oblivious to the density of jobs. On the other hand, and intuitively, our approach allows for “instance-based” bookkeeping: the jobs are organized in a natural way with respect to their density. Resorting to this approach incurs significant technical challenges. Namely, the structure of the solution our tree maintains is hierarchically organized, but each tree update, which requires node rotations, breaks this structure. This requires additional care to efficiently maintain an approximate solution after an update, as well as an entirely different approach for maintaining a partitioning of time Z(Q) within cells. Moreover, we show how to leverage these ideas further to obtain a deterministic approach.

In our work, we use borders to define the so-called sparse and dense ranges. This idea is inspired by the work of [9]. We emphasize, though, that one of our main contributions and arguably the most technically involved component is showing how to algorithmically employ those borders in running-time only polynomially dependent on \(1/\varepsilon \), while [9] require exponential dependence on \(1/\varepsilon \).

Our construction of the auxiliary data structure Z(Q), which enables us to boost an \(O(\log (n))\)-approximate solution into a decomposition enabling a \((1+\varepsilon )\)-approximate solution, is inspired by the approach of [9]. They similarly develop Z(Q), but to boost an O(1)-approximation that fundamentally relies on the bounded-coordinate assumptions of jobs being within [1, N] and having length at least 1. Our different approach towards Z(Q) simplifies some of the arguments and does not rely on length or bounded-coordinate assumptions. Further, we note that the dynamic programming approach for sparse regions that we develop is significantly faster than the enumerative approach used in the prior work, which eventually enables us to obtain a \(\text {poly} (1/\varepsilon )\) dependence in the running time. The way we combine solutions over sparse regions is similar to the way it is done in the prior work.

4 Dynamic Unweighted Interval Scheduling on a Single Machine

In this section, we prove Theorem 1.1. As a reminder, Theorem 1.1 considers the case of interval scheduling in which \(w_j = 1\) for each j and \(M = 1\), i.e., the jobs have unit reward and there is only a single machine at our disposal. This case can also be seen as a task of finding a maximum independent set among intervals lying on the x-axis. The crux of our approach is in designing an algorithm that maintains the following invariant:

Invariant 1 The optimal solution restricted to the intervals lying strictly between any two consecutive borders contains at least \(\nicefrac {1}{\varepsilon }\) intervals.

We will maintain this invariant unless the optimal solution has fewer than \(\nicefrac {1}{\varepsilon }\) intervals, in which case we are able to compute the solution from scratch in negligible time. We aim for our algorithm to maintain Invariant 1 while keeping track of the optimal solution between each pair of consecutive borders. The high-level intuition for this is that if we do not maintain too many borders, then our solution must be very good (our solution decreases in size by at most 1 every time we add a new border). Furthermore, if the optimal solution between consecutive borders is small, it is easier for us to maintain these per-region solutions. We prove that this invariant enables a high-quality approximation:

Lemma 4.1

Consider a set of borders such that the optimal solution between each pair of consecutive borders contains at least K intervals. Then a solution that consists of an optimal solution between each pair of consecutive borders is a \(\frac{K+1}{K}\)-approximation.

Proof

For our analysis, suppose there are implicit borders at \(-\infty \) and \(+\infty \) so that all jobs are within the range of borders. Consider an optimal solution OPT. We will now design a \(\frac{K+1}{K}\)-approximate solution \(OPT'\) as follows: given OPT, delete all intervals in OPT that overlap a drawn border. Fix an interval J appearing in OPT but not in \(OPT'\). Assume that J intersects the i-th border. Recall that between the \((i-1)\)-st and the i-th border there are at least K intervals in \(OPT'\). Moreover, at most one interval from OPT intersects the i-th border. Hence, we can charge the removal of J to the at least K intervals appearing between the \((i-1)\)-st and the i-th border in \(OPT'\). Since each such group of intervals is charged at most once, we obtain \(|OPT| \le |OPT'| + \nicefrac {|OPT'|}{K}\), i.e., \(OPT'\) is a \(\frac{K+1}{K}\)-approximation of OPT. \(\square \)

Not only does Invariant 1 enable high-quality solutions, but it also assists us in quickly maintaining such a solution. We can maintain a data structure with \(O(\frac{\log (n)}{\varepsilon })\) update and \(O(\log (n))\) query time that moves the borders to maintain the invariant and thus maintains a \((1+\varepsilon )\)-approximation, as implied by Lemma 4.1.

Theorem 1.1

(Unweighted dynamic, single machine) Let \({\mathcal {J}}\) be a set of n jobs. For any \(\varepsilon > 0\), there exists a fully dynamic algorithm for \((1+\varepsilon )\)-approximate unweighted interval scheduling for \({\mathcal {J}}\) on a single machine performing updates in \(O\left( \frac{\log (n)}{\varepsilon } \right) \) and queries in \(O(\log (n))\) worst-case time.

Proof

Our goal now is to design an algorithm that maintains Invariant 1, which by Lemma 4.1 and for \(K = \nicefrac {1}{\varepsilon }\) will result in a \((1+\varepsilon )\)-approximation of Maximum-IS.

On a high level, our algorithm will maintain a set of borders. When compiling a solution of intervals, the algorithm will not use any interval that contains any of the borders but proceed by computing an optimal solution between each two consecutive borders. The union of those between-border solutions is the final solution. Moreover, we will maintain the invariant that the optimal solution for every contiguous region is of size within \([\frac{1}{\varepsilon }, \frac{2}{\varepsilon })\).

In the rest, we show how to implement these steps in the claimed running time.

Maintained data-structures. Our algorithm maintains a balanced binary search tree \(T_{\mathrm{{all}}}\) of intervals sorted by their starting points. Each node of \(T_{\mathrm{{all}}}\) will also maintain the end-point of the corresponding interval. It is well-known how to implement a balanced binary search tree with \(O(\log n)\) worst-case running time per insertion, deletion, and search query. Using such an implementation, the algorithm can in \(O(\log n)\) time find the smallest ending point in a prefix/suffix of the intervals sorted by their starting points. That is, in \(O(\log {n})\) time we can find the interval that ends earliest, among those that start after a certain time.

In addition, the algorithm also maintains a balanced binary search tree \(T_{\mathrm{{borders}}}\) of the borders currently drawn.

Also, we will maintain one more balanced binary search tree \(T_{\mathrm{{sol}}}\) that will store the intervals that are in our current solution.

We will use that for any range with an optimal solution of size S, we can make O(S) queries to these data structures to obtain an optimal solution for the range in \(O(S \cdot \log n)\) time.

Update after an insertion. Upon insertion of an interval J, we add J to \(T_{\mathrm{{all}}}\). We make a query to \(T_{\mathrm{{borders}}}\) to check whether J overlaps a border. If it does, we need to do nothing; in this case, we ignore J even if it belongs to an optimal solution. If it does not, we recompute the optimal solution within the two borders adjacent to J. If after recomputing, the new solution between the two borders is too large, i.e., it has at least \(\frac{2}{\varepsilon }\) intervals, then draw/add a border between the \(\frac{1}{\varepsilon }\)-th and the \((1+\frac{1}{\varepsilon })\)-th of those intervals.

Update after a deletion. Upon deletion of an interval J, we delete J from \(T_{\mathrm{{all}}}\). If J was not in our solution, we do nothing else. Otherwise, we recompute the optimal solution within the borders adjacent to J and modify \(T_{\mathrm{{sol}}}\) accordingly. Let those borders be the i-th and the \((i+1)\)-st. If the new solution between borders i and \(i+1\) now has size less than \(\nicefrac {1}{\varepsilon }\) (it would then have size exactly \(\nicefrac {1}{\varepsilon }-1\)), we delete an arbitrary one of the two borders (thus combining this region with an adjacent region). Then, we recompute the optimal solution within the (now larger) region J is in. If this results in a solution of size at least \(\nicefrac {2}{\varepsilon }\), we will need to split the newly created region by adding a border. Before splitting, the solution will have size upper-bounded by one more than the size of the solutions within the two regions before combining them, as an interval may have overlapped the now-deleted border (one region with size exactly \(\frac{1}{\varepsilon }-1\) and the other upper-bounded by \(\frac{2}{\varepsilon }-1\)). Thus, the solution has size in the range \([\nicefrac {2}{\varepsilon },\frac{3}{\varepsilon })\). We can add a border between the \(\nicefrac {1}{\varepsilon }\)-th and the \((\nicefrac {1}{\varepsilon }+1)\)-st interval of the optimal solution, giving a region with exactly \(\nicefrac {1}{\varepsilon }\) intervals and another with \([\nicefrac {1}{\varepsilon },\nicefrac {2}{\varepsilon })\) intervals, which maintains our invariant.

In all of these, the optimal solution for each region has size \(O(\nicefrac {1}{\varepsilon })\), so recomputing takes \(O(\nicefrac {\log (n)}{\varepsilon })\) time.

For queries, we will have maintained \(T_{\mathrm{{sol}}}\) in our updates such that it contains exactly the intervals in our solution. So, for each query, we just need to do a lookup to see whether the interval is in \(T_{\mathrm{{sol}}}\), which takes \(O(\log n)\) time. \(\square \)
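
The following is a simplified, illustrative sketch of the update and query procedures described in the proof above. For clarity it rescans a region's intervals whenever the region changes instead of using the augmented balanced search trees, so it does not achieve the stated \(O(\nicefrac {\log (n)}{\varepsilon })\) update time; class and method names are our own.

```python
from bisect import bisect_right, insort
from math import ceil, inf

class UnweightedScheduler:
    """Border maintenance for unweighted interval scheduling on one machine.

    Intervals are (start, finish) pairs with start < finish.  Borders split the
    time axis into regions; intervals containing a border in their interior are
    ignored, and inside every region we keep an optimal (earliest-finish greedy)
    solution whose size is kept within [1/eps, 2/eps).
    """

    def __init__(self, eps):
        self.k = ceil(1 / eps)
        self.intervals = []                     # all intervals, sorted by start
        self.borders = [-inf, inf]              # sorted; -inf and inf are implicit
        self.region_sol = {(-inf, inf): []}     # region -> optimal solution inside it

    def _region_of(self, point):
        i = bisect_right(self.borders, point)
        return self.borders[i - 1], self.borders[i]

    def _greedy(self, lo, hi):
        """Earliest-finish-time greedy over intervals lying inside [lo, hi]."""
        chosen, last = [], -inf
        inside = [iv for iv in self.intervals if lo <= iv[0] and iv[1] <= hi]
        for s, f in sorted(inside, key=lambda iv: iv[1]):
            if s >= last:
                chosen.append((s, f))
                last = f
        return chosen

    def _rebuild(self, lo, hi):
        """Recompute the region (lo, hi), splitting it while it is too dense."""
        self.region_sol.pop((lo, hi), None)
        sol = self._greedy(lo, hi)
        while len(sol) >= 2 * self.k:
            cut = sol[self.k][0]                # border between the k-th and (k+1)-st interval
            insort(self.borders, cut)
            self.region_sol[(lo, cut)] = sol[:self.k]
            lo, sol = cut, self._greedy(cut, hi)
        self.region_sol[(lo, hi)] = sol

    def insert(self, s, f):
        insort(self.intervals, (s, f))
        lo, hi = self._region_of(s)
        if f <= hi:                             # ignore intervals containing a border
            self._rebuild(lo, hi)

    def delete(self, s, f):
        self.intervals.remove((s, f))
        lo, hi = self._region_of(s)
        if (s, f) not in self.region_sol.get((lo, hi), []):
            return                              # deleted interval was not in the solution
        if len(self._greedy(lo, hi)) < self.k and len(self.borders) > 2:
            # Region became too sparse: delete one adjacent border and merge regions.
            self.region_sol.pop((lo, hi), None)
            if hi != inf:
                nxt = self.borders[bisect_right(self.borders, hi)]
                self.region_sol.pop((hi, nxt), None)
                self.borders.remove(hi)
                hi = nxt
            else:
                prv = self.borders[bisect_right(self.borders, lo) - 2]
                self.region_sol.pop((prv, lo), None)
                self.borders.remove(lo)
                lo = prv
        self._rebuild(lo, hi)

    def solution_size(self):                    # query: size of the maintained solution
        return sum(len(sol) for sol in self.region_sol.values())

    def in_solution(self, s, f):                # query: is this interval scheduled?
        return (s, f) in self.region_sol.get(self._region_of(s), [])
```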

This result improves the best-known time complexities [8, 9]. Unfortunately, it does not immediately generalize well to the weighted variant. In Sect. 5, we show our more technically-challenging result for the weighted variant.

5 Dynamic Weighted Interval Scheduling on a Single Machine

This section focuses on a more challenging setting in which jobs have non-uniform weights. Non-uniform weights introduce difficulties for the approach mentioned in Sect. 4, as adding a border (which entails ignoring all the jobs that cross that border) may now force us to ignore a very valuable job. Straightforward extensions of this border-based approach require at least a linear dependence on the ratio between job rewards (e.g., if all jobs have rewards within [1, w], then straightforward extensions would require a linear dependence on w). This is because an ignored job containing a border can have a reward of w (as opposed to just 1), requiring \(\nicefrac {w}{\varepsilon }\) reward inside the region to charge it against (as opposed to just \(\nicefrac {1}{\varepsilon }\)). In this work, we show how to perform this task in \(O(\text {poly}(\log (n),\nicefrac {1}{\varepsilon }))\) time, having no such dependency on the rewards of the jobs or the starting/ending times. This improves upon the best-known preexisting result of \(O(\text {poly}(\log (n),\log (N),\log (w)) \cdot \text {exp}(\nicefrac {1}{\varepsilon }))\) time accomplished by the decomposition scheme designed in the work of Henzinger et al. [9], which we compare with in Sect. 3.2.6. Both our algorithm and our analysis introduce new ideas that enable us to design a dynamic algorithm with running time having only polynomial dependence on \(\nicefrac {1}{\varepsilon }\) and \(\log (n)\), yielding an exponential improvement in terms of \(\nicefrac {1}{\varepsilon }\) over [9], and removing all dependence on N and w. Moreover, our algorithm is deterministic and requires no assumption on the lengths or coordinate values of the jobs; [9] is also deterministic, but it assumes all jobs have length at least 1 and all coordinates are within [0, N], where N affects the time complexity.

As the first step, we show that there exists a \((1+\varepsilon )\)-approximate optimal solution \(OPT'\) that has nearly-optimal sparse structure, similar to a structure used in [9]. We define properties of this structure in Sect. 5.2, although it is instructive to think of this structure as a set of non-overlapping time ranges such that:

  1. Within each time range, there is an approximately optimal solution that contains a small number of jobs (called sparse);

  2. The union of solutions across all the time ranges is \((1+\varepsilon )\)-approximate; and

  3. There is an efficient algorithm to obtain these time ranges.

Effectively, this structure partitions time such that we get an approximately optimal solution by computing sparse solutions within partitioned time ranges and ignoring jobs that are not fully contained within one partitioned time range. To obtain the guarantees of such a set of time ranges that can be obtained efficiently, we utilize a new hierarchical decomposition based on a balanced binary search tree and employ novel charging arguments. This result is described in detail in Sect. 5.2.

Once equipped with this structural result, we first design a dynamic programming approach to compute an approximately optimal solution within one time range. Let \(w_{max}\) denote the maximal reward among all jobs currently in the instance. To obtain an algorithm whose running time is proportional to the number of jobs in the solution for a time range, as opposed to the length of that range, we “approximate” states that our dynamic programming approach maintains, and ultimately obtain the following claim whose proof is deferred to Sect. 5.3.

Lemma 5.1

Given any contiguous time range \({\mathcal {R}}\) and an integer K, consider an optimal solution \(OPT({\mathcal {R}}, K)\) in \({\mathcal {R}}\) containing at most K jobs and ignoring jobs with weight less than \(\nicefrac {\varepsilon }{n} \cdot w_{max}\). Then, there is an algorithm that in \({\mathcal {R}}\) finds a \((1+\varepsilon )\)-approximate solution to \(OPT({\mathcal {R}}, K)\) in \(O\left( \frac{K \log (n) \log ^2(K/\varepsilon )}{\varepsilon ^2} \right) \) time and with at most \(O\left( \frac{K \log (K/\varepsilon )}{\varepsilon } \right) \) jobs.

Observe that the running time of the algorithm given by Lemma 5.1 has no dependence on the length of \({\mathcal {R}}\). Also observe that the algorithm possibly selects slightly more than K jobs to obtain a \((1+\varepsilon )\)-approximation of the best possible reward one could obtain by using at most K jobs in \({\mathcal {R}}\) (i.e., \(OPT({\mathcal {R}}, K)\)).

Finally, in Sect. 5.5 we combine all these ingredients and prove the main theorem of this section.

Theorem 1.2

(Weighted dynamic, single machine) Let \({\mathcal {J}}\) be a set of n weighted jobs. For any \(\varepsilon > 0\), there exists a fully dynamic algorithm for \((1+\varepsilon )\)-approximate weighted interval scheduling for \({\mathcal {J}}\) on a single machine performing updates and queries in worst-case time \(T \in \text {poly} (\log n,\frac{1}{\varepsilon })\). The exact complexity of T is given by

$$\begin{aligned} O\left( \frac{\log ^{12}(n)}{\varepsilon ^{7}} + \frac{\log ^{13}(n)}{\varepsilon ^{6}} \right) . \end{aligned}$$

5.1 Decomposition Overview

We utilize a hierarchical decomposition to organize time such that we may efficiently obtain time ranges that satisfy the nearly-optimal sparse structure. This decomposition has two levels of granularity. For the higher-level decomposition, we employ a decomposition similar to that of a balanced binary search tree with \(O(\log (n))\) depth. Each cell Q in this balanced binary search tree will correspond to a range of time. Further details on this hierarchical decomposition are described in Sect. 5.2.1.

For the lower-level decomposition, we split each cell Q more finely. Formally, for a set of grid endpoints Z(Q), we define a grid slice as follows.

Definition 5.2

(Grid slice) Given a set of grid endpoints \(Z(Q) = \{r_1, r_2, \ldots , r_{X-1}\}\) with \(r_i < r_{i + 1}\), we use grid slice to refer to an interval \((r_i, r_{i + 1})\), for any \(1 \le i < {X-1}\). Note that a grid slice between \(r_i\) and \(r_{i + 1}\) does not contain \(r_i\) nor \(r_{i + 1}\).

We further discuss Z(Q) in Sect. 5.2.2. Importantly, Z(Q) is designed such that the optimal solution entirely within any grid slice is upper-bounded to be relatively small compared to the weight of the optimal solution within Q, or w(OPT(Q)). This property makes the grid endpoints Z(Q) a helpful tool in partitioning time. At a high level, Z(Q) is used to define a set of segments that motivate dynamic programming states of the form DP(Q, S), where each S corresponds to a segment between two grid endpoints of Z(Q), and DP(Q, S) computes an approximately optimal sparse solution among schedules that can only use jobs contained within the segment of time S. The key idea is that this dynamic programming enables the partitioning of time into dense and sparse ranges. Solutions for sparse ranges are computed immediately, while dense ranges are solved by children with dynamic programming (by further dividing the dense range into more sparse and dense ranges). We recall from Sect. 3.2.6 that [9] were the first to design a two-level hierarchical decomposition that computes DP(Q, S) to optimize over dense and sparse ranges. However, we emphasize that our work utilizes entirely new approaches for our high-level hierarchical decomposition into cells Q, for our low-level decomposition of each cell into Z(Q), and for our method of computing approximately optimal sparse solutions of DP(Q, S).

5.2 Solution of Nearly-Optimal Sparse Structure

To remove exponential dependence on \(\nicefrac {1}{\varepsilon }\) and all dependence on N and w, we introduce a new algorithm for approximating sparse solutions, a new hierarchical decomposition, and novel charging arguments that (among other things) reduce the number of grid endpoints |Z(Q)| required in each cell. With this, we will compute an approximately optimal solution of the following very specific structure.

Definition 5.3

(Nearly-optimal sparse structure) To have nearly-optimal sparse structure, a solution must be able to be generated with the following specific procedure:

  • Each cell Q will receive a set of time ranges, denoted as RANGES(Q), with endpoints in Z(Q). To start, \(Q_{root}\) will receive one time range containing all of time (i.e., \(RANGES(Q_{root}) = \{[-\infty ,\infty ]\}\))

  • RANGES(Q) is split into a collection of disjoint time ranges, with each being assigned to one of three sets: SPARSE(Q), \(RANGES(Q_L)\), \(RANGES(Q_R)\)

  • SPARSE(Q), a set of time ranges, must have endpoints in \(Z(Q) \cup Z(Q_L) \cup Z(Q_R)\)

  • For each child \(Q_{child}\) (where \(child \in \{L,R\}\)) of Q, \(RANGES(Q_{child})\) must have all endpoints in \(Z(Q_{child})\)

  • The total weight of sparse solutions (solutions with at most \(\nicefrac {1}{\varepsilon }\) jobs) within sparse time ranges must be large (where \(SPARSE\_OPT({\mathcal {R}})\) denotes an optimal solution having at most \(\nicefrac {1}{\varepsilon }\) jobs within range \({\mathcal {R}}\)):

    $$\begin{aligned} \sum _Q \sum _{{\mathcal {R}} \in SPARSE(Q)} w(SPARSE\_OPT({\mathcal {R}})) \ge (1-O(\varepsilon ))w(OPT) \end{aligned}$$

Now, we prove our result: a \((1+\varepsilon )\)-approximation algorithm for dynamic weighted interval scheduling (Maximum-IS) with only polynomial time dependence on \(\nicefrac {1}{\varepsilon }\) and \(\log (n)\). Unlike the decomposition of Henzinger et al., we will not define our decomposition such that each cell Q will split exactly in half to produce both its children \(Q_L\) and \(Q_R\). Instead, we will divide every cell Q in a manner informed by a balanced binary search tree. Desirably, this will make the depth of our decomposition \(O(\log (n))\) instead of \(O(\log (N))\), but it removes the possibility of utilizing the random-offset style of idea to assign jobs to cells in which each job’s length is approximately an \(\varepsilon \) fraction of the cell’s length. This necessitates novel charging arguments. We supplement this new hierarchical decomposition with a new alternative for the Z(Q) data structure that enables us to determine important dynamic program subproblems without any dependence on N. Additionally, we take a new approach for solving the small sparse subproblems, where we use an approximate dynamic programming idea to remove exponential dependence on \(\nicefrac {1}{\varepsilon }\) in the best known running time for these subproblems. In our novel charging arguments, there is a particular focus on changing where deleted intervals’ weights are charged and on introducing a snapping budget, which we use to relax the required number of grid endpoints |Z(Q)| to depend only polynomially on \(\nicefrac {1}{\varepsilon }\). As a reminder, Z(Q) is a set of grid points within Q such that between any two consecutive points we are guaranteed that the optimal solution has small weight. Our final algorithm will consider a number of subproblems for each cell proportional to \(|Z(Q)|^2\), so improvements in |Z(Q)| directly lead to improvements in the best-known running time. Effectively, we make each of our smaller subproblems easier to solve while also reducing the number of subproblems we need to solve. All improvements are exponential in \(\nicefrac {1}{\varepsilon }\) and remove dependence on N and w.

5.2.1 Hierarchical Decomposition

We now formally describe our hierarchical decomposition of jobs.

  • Consider the set of all jobs’ starting/ending times, i.e., for each job i, include \(s_i\) and \(f_i\). Now, consider a balanced binary search tree T over this set of times. For the sake of this paper, one can assume this is maintained by a red-black tree such that the tree has depth \(O(\log (n))\) and \(O(\log (n))\) rotations are required per update. We have a cell Q in our hierarchical decomposition corresponding to each node in T. Let KEY(Q) be the corresponding key for the node in T.

  • Each Q has a left child \(Q_L\) or right child \(Q_R\) if the corresponding node in T does.

  • Each cell Q represents a range of time. \(Q_{root}\) corresponds to all time, meaning \(TIME(Q_{root})=[-\infty ,\infty ]\). This time range is split for the children of Q by KEY(Q). More formally, given a cell Q where \(TIME(Q)=[l_Q,r_Q]\), then (if \(Q_L\) exists) \(TIME(Q_L)=[l_Q,KEY(Q)]\), and (if \(Q_R\) exists) \(TIME(Q_R)=[KEY(Q),r_Q]\).

This fully describes our hierarchical decomposition of depth \(O(\log (n))\). A visual example is provided in Fig. 1.
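
A minimal static sketch of this decomposition is given below, assuming for simplicity that the tree is built balanced from the sorted endpoints rather than maintained as a red-black tree under rotations; class and function names are illustrative. It also shows the assignment of jobs to cells from Sect. 3.2.1: a job is assigned to every cell whose time range contains it, i.e., to every cell on the path from the root down to the deepest cell whose range still contains the job.

```python
from math import inf

class Cell:
    def __init__(self, key, lo, hi):
        self.key = key                    # KEY(Q): one endpoint of some job
        self.time = (lo, hi)              # TIME(Q)
        self.left = self.right = None     # Q_L and Q_R
        self.jobs = []                    # jobs assigned to this cell

def build_decomposition(endpoints, lo=-inf, hi=inf):
    """Build the cell hierarchy over sorted, distinct endpoint values."""
    if not endpoints:
        return None
    mid = len(endpoints) // 2
    cell = Cell(endpoints[mid], lo, hi)
    cell.left = build_decomposition(endpoints[:mid], lo, cell.key)
    cell.right = build_decomposition(endpoints[mid + 1:], cell.key, hi)
    return cell

def assign_job(root, s, f, w):
    """Assign job [s, f] with reward w to every cell whose time range contains it."""
    cell = root
    while cell is not None:
        lo, hi = cell.time
        if not (lo <= s and f <= hi):
            break
        cell.jobs.append((s, f, w))
        if f <= cell.key:
            cell = cell.left              # job lies in TIME(Q_L) = [lo, KEY(Q)]
        elif s >= cell.key:
            cell = cell.right             # job lies in TIME(Q_R) = [KEY(Q), hi]
        else:
            break                         # job crosses KEY(Q); no deeper cell contains it

# The instance from Fig. 1 (unit rewards, for illustration only).
jobs = [(1, 5, 1.0), (2, 10, 1.0), (7, 20, 1.0), (4, 5, 1.0)]
endpoints = sorted({e for s, f, _ in jobs for e in (s, f)})
root = build_decomposition(endpoints)
for job in jobs:
    assign_job(root, *job)
```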

5.2.2 Structure Z(Q)

We use the set of grid points Z(Q) to determine segments that will be used as subproblems for dynamic programming and in reference to the nearly-optimal sparse structure. For some specified X, our goal is to maintain a Z(Q) such that the optimal solution within every grid slice is at most \(O(\nicefrac {w(OPT(Q))}{X})\). The previously-utilized methods for obtaining this require logarithmic dependence on N and w. To remove dependence on w, we relax our requirements of Z(Q) to ignore all jobs with weight less than \(w(OPT(Q)) \cdot \nicefrac {\varepsilon }{n}\); in total, these jobs have negligible reward. To remove dependence on N, we consider an alternative approach to computing Z(Q), where we take the union of multiple solutions to Z(Q) for the analogous unweighted interval scheduling problem using ideas similar to those in Sect. 4. We design a Z(Q) with the following guarantees, whose proof is deferred to Sect. 5.4:

Lemma 5.4

(Dynamically maintaining Z(Q)) For any fixed positive integer X, it is possible to return a set Z(Q) for any cell Q in the hierarchical decomposition in \(O(X \cdot \log ^3(n))\) query time. Moreover, the returned Z(Q) will satisfy the following properties:

  • For every Q, the optimal solution within each grid slice of Z(Q) is at most \(O(\nicefrac {w(OPT(Q))}{X})\); as a reminder, we ignore jobs with weights less than \(w(OPT(Q)) \cdot \nicefrac {\varepsilon }{n}\).

  • For every Q, \(|Z(Q)| = O(X \cdot \log ^2(n))\)

5.2.3 Existence of Desired \((1+\varepsilon )\)-Approximate Solution

We now argue that there exists a \((1+O(\varepsilon ))\)-approximation with nearly-optimal sparse structure in reference to our new hierarchical decomposition for Q and our Z(Q) when using \(X=\frac{\log ^2(n)}{\varepsilon ^2}\) and thus \(|Z(Q)|=O(\frac{\log ^4(n)}{\varepsilon ^2})\):

Lemma 5.5

There exists a solution \(OPT'\) that has nearly-optimal sparse structure and such that \(w(OPT') \ge (1 - O(\varepsilon )) w(OPT)\). Thus, \(OPT'\) is a \((1 + O(\varepsilon ))\)-approximation of OPT.

Proof

We emphasize that the goal of this lemma is not to show how to construct a solution algorithmically, but rather to show that there exists one, that we refer to by \(OPT'\), that has a specific structure and whose weight is close to OPT.

In this paragraph, we provide a proof overview. At a high level, we show this claim by starting with OPT, and maintaining a solution \(OPT'\) that has our desired structure and is obtained by deleting jobs of total weight only \(O(\varepsilon \cdot w(OPT))\). Our process of converting OPT to \(OPT'\) is recursive, as we start at the root and work down. Generally, our preference for any range \({\mathcal {R}} \in RANGES(Q)\) will be to defer it to a child by passing it on to a \(RANGES(Q_{child})\). This preference can often not be immediately satisfied for two reasons: (i) \({\mathcal {R}}\) may not be completely contained within a \(Q_{child}\) (i.e., \({\mathcal {R}}\) crosses between \(Q_L\) and \(Q_R\)), or (ii) the endpoints of \({\mathcal {R}}\) do not align with the corresponding \(Z(Q_{child})\). We will modify OPT to accommodate these concerns. To handle concern (i), we will delete a job in OPT if it crosses between \(Q_{L}\) and \(Q_{R}\) and has small weight (and hence it can be ignored). Otherwise, if such a crossing job has large weight, we will divide \({\mathcal {R}}\) into three time ranges such that one is contained within \(Q_L\), one uses the crossing job, and the last is contained within \(Q_R\), using a process detailed in the following proof. For the central third, we will sparsify this range to produce a set SPARSE(Q) of sparse time ranges. For time ranges completely contained within \(Q_L\) and \(Q_R\) that are not designated as sparse time ranges, we will essentially consider them dense time ranges, which will be delegated to the children cells of Q. In order to delegate a time range to a child \(Q_{child}\), we require that the delegated time range must have endpoints that align with \(Z(Q_{child})\). Accordingly, we perform modifications to “snap” the time ranges’ endpoints to \(Z(Q_{child})\) for the corresponding child \(Q_{child}\) of Q and include the “snapped” time ranges in \(RANGES(Q_{child})\). We show that throughout this process, we do not delete much weight from OPT and obtain an \(OPT'\) that has our desired structure. Now, we present the proof in detail:

Deleting light crossing jobs.

We now describe how to modify OPT, obtaining \(OPT'\), such that \(OPT'\) has our desired structure and \(OPT'\) is a \((1+\varepsilon )\)-approximation of OPT. Note that we will never actually compute \(OPT'\). It is only a hypothetical solution that has nice structural properties and that we use to compare our output to.

For a cell Q, consider a time range it receives in RANGES(Q). We shall split this time range into sparse time ranges (to be added to SPARSE(Q)) and dense time ranges (to be added to \(RANGES(Q_L)\) or \(RANGES(Q_R)\)). There is at most one range \({\mathcal {R}}_{cross} \in RANGES(Q)\) that crosses between \(Q_L\) and \(Q_R\), and we call the at most one job crossing between \(Q_L\) and \(Q_R\) the crossing job (if it exists). If the crossing job has weight \(\le \frac{\varepsilon }{\log (n)} w(OPT(Q))\), we call it light, we delete the light crossing job, and we split \({\mathcal {R}}_{cross}\) at the dividing point KEY(Q). One of these two resulting ranges can inherit the snapping budget of \({\mathcal {R}}_{cross}\), while we can allocate the other a snapping budget of weight \(O(\frac{\varepsilon }{\log (n)} w(OPT(Q)))\). We delete/allocate at most \(O(\frac{\varepsilon }{\log (n)} w(OPT(Q)))\) weight at every cell, \(O(\frac{\varepsilon }{\log (n)} w(OPT))\) weight at every level, and \(O(\varepsilon w(OPT))\) weight in total. Also note how all ranges in RANGES(Q) are now completely contained within either \(Q_L\) or \(Q_R\). Otherwise, if the crossing job has large weight, we call it heavy and must find some way to include it in our solution instead of deleting it.

Utilizing heavy crossing jobs.

We now focus on showing how to construct our solution using a heavy crossing job. Our goal is to split \({\mathcal {R}}_{cross}\) into three parts: one range completely within \(Q_L\), some sparse ranges that will form SPARSE(Q) and include the crossing job among other jobs, and one range completely within \(Q_R\). As an overview, we start by considering the smallest time range that contains the crossing job and spans the grid between two (not necessarily consecutive) endpoints in Z(Q). This range may contain many jobs of OPT, so we perform an additional refinement to divide it into sparse time ranges. In this refinement, we split up the time range such that we do not delete too much weight and, moreover, all of the resulting time ranges have at most \(\nicefrac {1}{\varepsilon }\) jobs. These time ranges now constitute SPARSE(Q). A detailed description of this process of determining SPARSE(Q) is given in the stages from “utilizing heavy crossing jobs” to “sparsifying the region.” For an example of this process that uses the terminology introduced in these stages, see Fig. 4. Any remaining time ranges not selected at this stage will effectively be dense time ranges, and are delegated into \(RANGES(Q_L)\) and \(RANGES(Q_R)\) (after dealing with their alignment issues). This process of designating time ranges to delegate is detailed in the stages from “snapping dense ranges” to “resolving leaves.”

As a reminder, we have chosen Z(Q) such that the total weight inside any grid slice (a time range between two consecutive endpoints of Z(Q)) of Q is at most \(\frac{\varepsilon ^2}{\log ^2(n)} w(OPT(Q))\). Recall that Z(Q) contains grid endpoints. For the heavy crossing job, consider the grid endpoint immediately to its left and the one immediately to its right. Without loss of generality, consider the right one and call it r. How we proceed splits into two cases:

  1.

    In the first case, r overlaps a job J in \(OPT'\) with weight at most \( \frac{\varepsilon }{\log (n)} w(OPT(Q))\). We delete J and draw a boundary at r. In doing this, we charge the weight of J against the cell Q. There are at most two jobs we charge in this manner for the original heavy interval: one for the grid endpoint to its right and one for the grid endpoint to its left. Hence, each cell is charged in this manner at most twice, for a total of \(O(\frac{\varepsilon }{\log (n)} w(OPT))\) weight at each level and \(O(\varepsilon w(OPT))\) weight overall.

  2.

    In the other case, r overlaps a job J with weight greater than \(\frac{\varepsilon }{\log (n)} w(OPT(Q))\). We call J a highlighted job. Our algorithm proceeds by considering the grid endpoint immediately to the right of J. We determine what to do with this grid endpoint recursively: we distinguish the same two cases as for r, and continue this recursive process until we finally draw a boundary.

After this process, we will have drawn a region (the time range between the left and right boundaries we drew) in which \(OPT'\) has the one heavy crossing job, a number of highlighted jobs (possibly zero), and potentially some remaining jobs that are neither crossing nor highlighted (we call these useless). Our goal is to convert this region into time ranges that we can use as sparse time ranges. Our process also guarantees that the region's borders are endpoints in Z(Q). Note that we have created a region within some time range of RANGES(Q), but not every point of that time range is necessarily contained within the region.
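To illustrate the boundary-drawing walk, the following is a minimal Python sketch, not the data structure our algorithm uses: the Job record, the linear scan for the job overlapping a grid endpoint, and the parameter threshold (playing the role of \(\frac{\varepsilon }{\log (n)} w(OPT(Q))\)) are hypothetical simplifications, and we assume that Z contains the right border of Q so that the walk terminates.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Job:
    left: float
    right: float
    weight: float

def expand_right(start: float, Z: List[float], jobs: List[Job],
                 threshold: float) -> Tuple[float, List[Job], Optional[Job]]:
    """Walk right over grid endpoints of Z(Q), beginning at the endpoint
    immediately to the right of `start` (the heavy crossing job's right end).
    If the job overlapping the current endpoint is light (weight <= threshold),
    it is deleted and the region's right boundary is drawn there; otherwise it
    is marked highlighted and the walk continues right of that job."""
    highlighted: List[Job] = []
    r = Z[bisect_right(Z, start)]          # assumes Z contains Q's right border
    while True:
        J = next((j for j in jobs if j.left < r < j.right), None)
        if J is None or J.weight <= threshold:
            return r, highlighted, J       # boundary at r; J (if any) is deleted
        highlighted.append(J)              # heavy enough: highlight and keep walking
        r = Z[bisect_right(Z, J.right)]
```

The walk that extends the region to the left is symmetric.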

Deleting useless jobs.

In the generated region, we define useless jobs as all jobs that are neither crossing nor highlighted. Useless jobs are completely contained within grid slices. We want to convert the region into sparse time ranges, but there may be many useless jobs that make the region very dense. Thus, we delete all jobs in the region that are useless. By the process of generating the region, any such job is fully contained within a grid slice that is partially overlapped by a heavy crossing job or a highlighted job. We charge the deletion of all useless jobs in a given slice against a highlighted or heavy crossing job that partially overlaps that slice. By the definition of Z(Q), the useless jobs in a slice add up to a total weight of at most \(\frac{\varepsilon ^2}{\log ^2(n)} w(OPT(Q))\); this is because we set Z(Q) with \(X = \frac{\log ^2 (n)}{\varepsilon ^2}\), so the optimal solution within any grid slice has total weight at most \(\frac{\varepsilon ^2}{\log ^2 (n)} w(OPT(Q))\). Moreover, \(\frac{\varepsilon ^2}{\log ^2(n)} w(OPT(Q))\) is at least a factor of \(\varepsilon \) less than the weight of the highlighted or heavy crossing job we charge against (and the useless jobs of only two such slices are charged against any highlighted or heavy job).

Sparsifying the region.

Now, the region contains only the heavy crossing job and highlighted jobs. We aim to split the region into ranges for SPARSE(Q) without deleting much weight. The region may have more than \(\frac{1}{\varepsilon }\) jobs (meaning it is not sparse). If this is the case, we split the region into time ranges that each have \(\le \frac{1}{\varepsilon }\) jobs and start/end at grid endpoints of Z(Q). To do so, we number the jobs in the region from left to right and consider them in groups based on their index modulo \(\frac{1}{\varepsilon }\). Note that a group does not consist of consecutive jobs. Then, we delete the group with the lowest weight. This is useful because every job in the region contains a grid endpoint: heavy crossing jobs contain a grid endpoint by how we defined Z(Q), and highlighted jobs contain a grid endpoint by their definition. Thus, we can delete the jobs belonging to the lightest group and split the time range at a grid endpoint contained inside each of the deleted jobs. In doing so, we lose at most an \(\varepsilon \) fraction of the total weight of all the considered jobs. However, now each resulting time range has at most \(\frac{1}{\varepsilon }\) jobs and thus is a valid sparse range in SPARSE(Q) (any range containing more than \(\frac{1}{\varepsilon }\) consecutive jobs has been split). Note that all these sparse ranges have endpoints in Z(Q). With all of its terminology now defined, readers may find the example illustrated in Fig. 4 helpful; a sketch of this grouping follows the figure.

Fig. 4

This example illustrates how the sparse regions are created. All vertical segments within Q, which are red in the figure, correspond to the points in Z(Q). The cell Q is divided by Z(Q) such that the optimal solution within every grid slice is small. As a reminder, a grid slice is an open time-interval between two consecutive points in Z(Q); see Definition 5.2 for a formal definition. We start with the heavy crossing job (the blue horizontal segment marked by “B”). From this heavy crossing job, we expand the region outwards as necessary. In this example, we expanded to the right, seeing two highlighted jobs (the green horizontal segments marked by “G”) until we saw a job with low enough weight intersecting a grid endpoint (these job segments are colored in brown and crossed). We delete such brown jobs, and use the grid endpoints they intersected to define the region (outlined in purple and annotated by “new region”). Useless jobs (pictured in yellow) are then deleted. Later, we sparsify the region
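To make the grouping of the sparsification step concrete, here is a minimal sketch under the simplifying assumption that every job handed to it contains a grid endpoint of Z(Q) (as argued above for the heavy crossing job and the highlighted jobs); the Job record is the same hypothetical one as in the previous sketch.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Job:                 # same hypothetical record as in the previous sketch
    left: float
    right: float
    weight: float

def sparsify_region(jobs: List[Job], Z: List[float],
                    eps: float) -> Tuple[List[Job], List[float]]:
    """Number the jobs left to right, group them by index modulo ceil(1/eps),
    delete the lightest group, and split the region at one grid endpoint of Z(Q)
    inside each deleted job. Each resulting piece contains at most ceil(1/eps)
    jobs, and the deleted weight is at most an eps-fraction of the total."""
    k = math.ceil(1 / eps)
    jobs = sorted(jobs, key=lambda j: j.left)
    groups = [jobs[g::k] for g in range(k)]                        # index mod k
    lightest = min(groups, key=lambda g: sum(j.weight for j in g))
    kept = [j for j in jobs if j not in lightest]
    splits = [next(z for z in Z if j.left < z < j.right) for j in lightest]
    return kept, sorted(splits)
```

The same routine is reused in the “using essential jobs” stage below, with \(Z(Q_{child})\) in place of Z(Q).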

Snapping dense ranges.

Recall that not all of the time ranges that we are modifying from RANGES(Q) were part of the region. In particular, there are the time ranges originally in RANGES(Q) other than \({\mathcal {R}}_{cross}\), as well as the part of \({\mathcal {R}}_{cross}\) to the left of the region and the part to its right. We call these remaining time ranges our dense ranges because they may contain many jobs. Note that all dense ranges are now completely contained within \(Q_L\) or \(Q_R\). Ideally, we would simply assign the dense ranges to \(RANGES(Q_L)\) or \(RANGES(Q_R)\). However, they have one remaining potential issue: their endpoints may not align with \(Z(Q_{child})\) even though they align with Z(Q). For an example of this issue, see Fig. 5. The core of the problem is that these dense time ranges correspond to time ranges we would like to delegate to children of Q (i.e., add to \(RANGES(Q_L)\) and \(RANGES(Q_R)\)), yet time ranges delegated to \(RANGES(Q_L)\) and \(RANGES(Q_R)\) must have endpoints in \(Z(Q_L)\) and \(Z(Q_R)\), respectively. Therefore, we have to modify the dense ranges so they align with the grid endpoints of one of Q’s children. It is tempting to naively “snap” the endpoints of these time ranges inward to the nearest grid endpoints of \(Z(Q_{child})\), i.e., to slightly contract the time ranges so that their endpoints align with \(Z(Q_{child})\). Unfortunately, this might result in some jobs being ignored in the process (as illustrated in Fig. 5); a cell does not consider jobs which are not within a given range. If these ignored jobs have non-negligible total reward, ignoring them can result in a poor solution. In the stage “snapping dense ranges” we detail a more involved contraction-like snapping process, which contracts inwards similarly to how we expanded outwards from heavy crossing jobs when we determined sparse ranges. In this contraction-like snapping process, we convert some of the beginning and end of the dense range into sparse ranges, so that we do not need to delete the high-reward jobs that naive snapping would delete. In the stages from “using essential jobs” to “resolving leaves”, we detail how to apply modifications to fulfill the required properties and how to analyze the contraction process with charging arguments.

Fig. 5

This example illustrates why the snapping we perform has to be done with care. The horizontal segments in this figure represent jobs. We show an initial dense range (outlined in purple) with endpoints in Z(Q). With dashed vertical lines, we show where these endpoints are in \(Q_L\). Importantly, they are not aligned with \(Z(Q_L)\), i.e., the vertical dashed lines do not belong to \(Z(Q_L)\). However, our structure requires that dense ranges align with \(Z(Q_{child})\), so we must address this. If we were to naively snap the endpoints of the dense range inwards to the endpoints of \(Z(Q_L)\), then we would need to delete some jobs (these deleted jobs are colored in yellow and marked by “Y”), while some other jobs would not be affected (like the remaining jobs in this example, those colored in blue). While this naive snapping may be fine in some cases, it incurs significant loss when the “Y” jobs have large weight. Notice that naively snapping outward to define a new region corresponding to the purple one is not a solution either, as this could cause the dense time range to overlap with a previously selected sparse time range. Having overlapping ranges can cause us to choose intersecting jobs, and thus an invalid solution. Thus, we detail a more careful manner of dealing with snapping

Consider an arbitrary unaligned dense time range U. Ideally, we would “snap” the endpoints of U inward to the nearest grid points of \(Z(Q_{child})\) (i.e., move the left endpoint of U to the closest grid point of \(Z(Q_{child})\) to its right, and the right endpoint of U to the closest grid point of \(Z(Q_{child})\) to its left). However, doing so may force us to delete a job in \(OPT'\) that is too valuable (we would have to delete the jobs that overlap the section of U that was snapped away). So, we handle U differently. Without loss of generality, suppose we want to “snap” inward the left endpoint of U to align with \(Z(Q_{child})\). Doing so may leave some jobs outside the snapped range. We define the cost of snapping as the total weight of the jobs that were previously contained within the range but are no longer completely contained within it after snapping. If immediately snapping the left endpoint inward to the nearest grid point of \(Z(Q_{child})\) would cost at most \(\frac{2\varepsilon }{\log ^2(n)} w(OPT(Q))\), we do that immediately. Otherwise, this snapping step would cost more than \(\frac{2\varepsilon }{\log ^2(n)} w(OPT(Q))\), implying that there is a job that overlaps the grid endpoint of \(Z(Q_{child})\) to the right of U’s left endpoint (all other jobs we would be forced to delete are strictly inside a slice of \(Z(Q_{child})\) and thus have total weight \(\le \frac{\varepsilon ^2}{\log ^2(n)} w(OPT(Q_{child})) \le \frac{\varepsilon ^2}{\log ^2(n)} w(OPT(Q))\)) and that this job has weight at least \(\frac{2\varepsilon }{\log ^2(n)} w(OPT(Q)) - \frac{\varepsilon ^2}{\log ^2(n)} w(OPT(Q)) \ge \frac{\varepsilon }{\log ^2(n)} w(OPT(Q))\). We mark that job as “essential”.

Then, we look to the right of that essential job and examine the job that overlaps the next grid endpoint of \(Z(Q_{child})\) to the right. If this job has weight at most \(\frac{2\varepsilon }{\log ^2(n)} w(OPT(Q))\), we delete it and draw a boundary. Otherwise, we mark it as “essential” and continue following the same process. When we are done, we have a prefix of the dense time range that contains some number of “essential” jobs and other jobs, followed by a border at a grid endpoint of \(Z(Q_{child})\). The final “snapping”, where we deleted jobs to add the split point, had cost \(\le \frac{2\varepsilon }{\log ^2(n)} w(OPT(Q))\). In essence, the essential jobs are exactly the jobs that were too valuable to delete during the snapping process.
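The scan of the last two paragraphs can be sketched as follows (illustration only): budget stands for \(\frac{2\varepsilon }{\log ^2(n)} w(OPT(Q))\), and for simplicity only the job overlapping the current grid endpoint is tested against the budget, since jobs strictly inside a slice are negligible by the choice of \(Z(Q_{child})\), as argued above. We also assume \(Z(Q_{child})\) contains a point at which the scan can stop.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Job:                 # same hypothetical record as in the earlier sketches
    left: float
    right: float
    weight: float

def snap_left(U_left: float, Z_child: List[float], jobs: List[Job],
              budget: float) -> Tuple[float, List[Job]]:
    """Scan grid endpoints of Z(Q_child) rightwards from U's left endpoint and
    return the endpoint at which the prefix border is drawn, together with the
    jobs marked essential along the way (those too valuable to delete)."""
    essential: List[Job] = []
    z = Z_child[bisect_right(Z_child, U_left)]   # nearest endpoint right of U_left
    while True:
        J = next((j for j in jobs if j.left < z < j.right), None)
        if J is None or J.weight <= budget:
            return z, essential                  # J (if any) is deleted; border at z
        essential.append(J)                      # essential: keep it and move on
        z = Z_child[bisect_right(Z_child, J.right)]
```

The right endpoint of U is snapped symmetrically.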

Using essential jobs.

We will assume this dense time range had a snapping budget and charge the aforementioned final snapping cost to it. Now, we just need to find a way to use the time range prefix with the essential jobs. We delete all jobs in this prefix that are not essential, using a similar argument as earlier: such a job is completely contained in a grid slice whose jobs have total weight \(\le \frac{\varepsilon ^2}{\log ^2(n)} w(OPT(Q))\), which is at most an \(\varepsilon \) fraction of the weight of an essential job partially contained within the slice (and an essential job is partially contained within at most two slices). Then, we convert this time range of essential jobs (with potentially many such essential jobs) into sparse time ranges in the same way as during the “sparsifying the region” step. We do so by grouping the jobs according to their index modulo \(\frac{1}{\varepsilon }\), deleting the group with the least total weight, and drawing a border at a grid endpoint of \(Z(Q_{child})\) contained within each of the deleted jobs. Again, by our process, all essential jobs must contain a grid endpoint. This creates sparse time ranges with endpoints in \(Z(Q)\cup Z(Q_{child})\), and our dense time range has endpoints in \(Z(Q_{child})\), so both are valid.

Financing a snapping budget.

Finally, we need to show that we actually have a sufficient snapping budget. Consider our dense time ranges. We may adjust their endpoints in other scenarios, but we only split a dense time range into more dense time ranges when it is a crossing range. As there is only one crossing range at every cell Q, if we give the newly created range a snapping budget of \(O(\frac{\varepsilon }{\log (n)} w(OPT(Q)))\), then we do not lose more than \(O(\varepsilon w(OPT))\) in total. We showed above that each dense range uses at most \(O(\frac{\varepsilon }{\log ^2(n)} w(OPT(Q)))\) of its snapping budget at each level, so it uses \(O(\frac{\varepsilon }{\log (n)} w(OPT(Q)))\) in total and stays within its allotted budget of \(O(\frac{\varepsilon }{\log (n)} w(OPT(Q)))\) throughout.

Resolving leaves.

Finally, when we have a time range that cannot be delegated to \(Q_{child}\) because \(Q_{child}\) does not exist, note that there is room for at most one job in the range (as, by the definition of the decomposition of Q, no job starts or ends in this range). So we simply consider this range as part of SPARSE(Q).

This concludes the proof by providing a way to convert OPT to a solution \(OPT'\) that obeys our structure and is a \((1+\varepsilon )\)-approximation of OPT. \(\square \)

5.3 Efficiently Approximating Sparse Solutions

Now, we focus on designing an efficient algorithm for approximating the optimal solution in a sparse time range.

Lemma 5.1

Given any contiguous time range \({\mathcal {R}}\) and an integer K, consider an optimal solution \(OPT({\mathcal {R}}, K)\) in \({\mathcal {R}}\) containing at most K jobs and ignoring jobs with weight less than \(\nicefrac {\varepsilon }{n} \cdot w_{max}\). Then, there is an algorithm that in \({\mathcal {R}}\) finds a \((1+\varepsilon )\)-approximate solution to \(OPT({\mathcal {R}}, K)\) in \(O\left( \frac{K \log (n) \log ^2(K/\varepsilon )}{\varepsilon ^2} \right) \) time and with at most \(O\left( \frac{K \log (K/\varepsilon )}{\varepsilon } \right) \) jobs.

Proof

To prove this claim, we use a dynamic programming approach where our state is the total weight of jobs selected so far. The dynamic programming table \(\textsc {earliest} \) contains, for a state X, the value \(\textsc {earliest} [X]\): the earliest/leftmost point in time by which total weight X can be achieved. If we implemented this dynamic program directly, it would require space proportional to the value of the solution (which equals the largest possible X). Our goal is to avoid this time/space dependence. To that end, we design an approximate dynamic program that requires only poly-logarithmic dependence on the value of an optimal solution. We derive the following technical tool to enable this:

Claim 5.6

Let S be the set of all powers of \((1+\varepsilon /K)\) not exceeding W, i.e., \(S = \{(1 + \varepsilon /K)^i \mid 0 \le i \le \lfloor \log _{1 + \varepsilon /K}{W} \rfloor \}\). Consider an algorithm that supports the addition of any K values (each being at least 1) where the sum of these K values is guaranteed to not exceed W. The values are added one by one. After each addition step, the algorithm maintains a running-total by rounding down the sum of the new value being added and the previous rounded running-total to the nearest value in S. Then, the final running-total of the algorithm is a \((1+\varepsilon )\) approximation of the true sum of those K values.

Proof

Consider the sequence of K values and thus K additions. Let OPT denote the exact sum of the K values. Let SOL denote the running-total we achieve at the end of our additions. Finally, let \(\textsc {CUR} _i\) denote the running-total at the beginning of stage i; it lies in S at the end of every stage. We prove that \(SOL \ge (1-\varepsilon )OPT\) and thus SOL is a \((1+\varepsilon )\) approximation of OPT. Initially, \(\textsc {CUR} _0=0\). In step i, we add the value \(v_i\) to \(\textsc {CUR} _i\), obtaining \({\textsc {CUR} '}_i = \textsc {CUR} _i + v_i\). Then, we round \({\textsc {CUR} '}_i\) down to the nearest power of \((1+\varepsilon /K)\) and denote the result by \({\textsc {CUR} ''}_i\). We call the amount we lose by rounding down the loss \(\ell _i = {\textsc {CUR} '}_i - {\textsc {CUR} ''}_i\). For the next stage, we set \(\textsc {CUR} _{i+1}={\textsc {CUR} ''}_i\). Note that

$$\begin{aligned} \frac{\ell _i}{OPT} \le \frac{\ell _i}{SOL} \le \frac{\ell _i}{{\textsc {CUR} ''}_i} = \frac{{\textsc {CUR} '}_i - {\textsc {CUR} ''}_i}{{\textsc {CUR} ''}_i} \le \frac{\varepsilon }{K} \end{aligned}$$

or, otherwise, we would have rounded to a different power of \((1+\varepsilon /K)\). Thus, \(\ell _i \le OPT (\frac{\varepsilon }{K})\). Note that \(SOL=\textsc {CUR} _{K}\) and \(\textsc {CUR} _{K} + \sum _{i}\ell _i = OPT\). As such,

$$\begin{aligned} SOL = OPT - \sum _i \ell _i \ge OPT - K \left( OPT \left( \frac{\varepsilon }{K} \right) \right) = OPT - \varepsilon \cdot OPT = (1 - \varepsilon ) OPT. \end{aligned}$$

\(\square \)
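As a purely illustrative numerical check of Claim 5.6 (the values and parameters below are made up), the rounding scheme can be simulated as follows:

```python
import math

def rounded_sum(values, eps):
    """Add the values one by one, rounding the running total down to the nearest
    power of (1 + eps/K) after every addition, where K = len(values)."""
    K = len(values)
    base = 1 + eps / K
    def round_down(x):
        return 0.0 if x < 1 else base ** math.floor(math.log(x, base))
    cur = 0.0
    for v in values:
        cur = round_down(cur + v)
    return cur

values = [3.7, 12.0, 1.4, 8.9, 5.5]           # each value is at least 1
approx, exact = rounded_sum(values, eps=0.1), sum(values)
assert (1 - 0.1) * exact <= approx <= exact   # the guarantee of Claim 5.6
```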

Inspired by Claim 5.6, we now define a set of states S as follows. Our states will represent powers of \((1+\varepsilon /K)\) from 1 to Kw, and hence

$$\begin{aligned} |S| = O\left( \frac{\log (Kw)}{\log (1+\varepsilon /K)} \right) = O\left( \frac{K\log (Kw)}{\varepsilon } \right) . \end{aligned}$$

In addition to these powers, S contains 0. For each state \(s \in S\), we want to maintain (approximately) the smallest time prefix in which at most K jobs can achieve total weight approximately equal to s. To do this, we loop over the states in increasing order of value. Suppose the current state corresponds to approximate weight \(s \in S\), and \(\textsc {earliest} [s]\) is the shortest prefix we have that achieves approximate weight s. Then we loop over all rounded weights \(v \in \{(1+\varepsilon )^i\}\); there are \(O(\nicefrac {\log (w)}{\varepsilon })\) such v. For each v, set \({\mathcal {V}}\) to be the value of \(s+v\) rounded down to the nearest power of \((1+\varepsilon /K)\). Then, if the earliest ending time of a job with rounded weight v that starts after \(\textsc {earliest} [s]\) is less than \(\textsc {earliest} [{\mathcal {V}}]\), we update \(\textsc {earliest} [{\mathcal {V}}]\) to that ending time. We can calculate the earliest ending time of a job with a particular rounded weight starting after some specified time in \(O(\log (n))\) time by maintaining a balanced binary search tree (as done in Sect. 4) for each of the \(O(\nicefrac {\log (w)}{\varepsilon })\) rounded weights (powers of \((1+\varepsilon )\)). This adds \(O(\log (n))\) time to each transition. In total, this solution runs in \(O(\frac{K \log (n) \log (w) \log (Kw)}{\varepsilon ^2})\) time.

As we can ignore all jobs with weight less than \(\nicefrac {\varepsilon }{n} w_{max}\), we can focus only on jobs with weights in \([\frac{\varepsilon w_{max}}{n},w_{max}]\) and effectively reduce w to \(\nicefrac {n}{\varepsilon }\) by dividing all weights by \(\frac{\varepsilon w_{max}}{n}\). This lets us use \(w=O(\nicefrac {n}{\varepsilon })\) in the above runtime bound. As such, the algorithm runs in \(O(\frac{K \log (n) \log (n/\varepsilon ) \log (K n /\varepsilon )}{\varepsilon ^2})\) time.
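For intuition, below is a simplified, self-contained sketch of this approximate dynamic program. It is not our data-structure implementation: it scans the jobs linearly instead of querying a balanced binary search tree per rounded weight, so it illustrates the state space and transitions rather than the stated running time, and it returns only the approximate weight without reconstructing the chosen jobs. Weights are assumed to be pre-scaled to lie in \([1, w]\).

```python
import math
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Job:
    start: float
    end: float
    weight: float                    # assumed pre-scaled to lie in [1, w]

def approx_best_weight(jobs: List[Job], K: int, eps: float, w: float) -> float:
    """States are 0 and the powers of (1 + eps/K) up to K*w; earliest[s] is the
    earliest ending time by which (rounded) total weight s can be reached.
    Transitions round job weights down to powers of (1 + eps) and round the new
    running total down to a state, as in Claim 5.6."""
    base = 1 + eps / K
    states = [0.0] + [base ** i for i in range(int(math.log(K * w, base)) + 1)]

    def round_state(x: float) -> float:
        return base ** math.floor(math.log(x, base))

    earliest: Dict[float, float] = {s: math.inf for s in states}
    earliest[0.0] = -math.inf                  # weight 0 needs no job
    best = 0.0
    for s in states:                           # increasing order of value
        if earliest[s] == math.inf:
            continue
        best = max(best, s)
        for J in jobs:                         # the real algorithm queries a BST here
            if J.start > earliest[s]:
                v = (1 + eps) ** math.floor(math.log(J.weight, 1 + eps))
                t = round_state(s + v)
                # totals beyond the largest state cannot arise from an optimum
                # of at most K jobs, so they are simply skipped
                if t in earliest and J.end < earliest[t]:
                    earliest[t] = J.end
    return best
```

The actual set of jobs can be recovered in the standard way by additionally storing, for each state, the job and predecessor state that last improved it.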

To show the algorithm’s correctness, observe that since we always round down, we never overestimate the total weight. Moreover, by Claim 5.6, any set of at most K additions is within a factor of \((1+\varepsilon )\) of its true value. \(\square \)

Corollary 5.7

For our application, we let \(K=\frac{1}{\varepsilon }\). As such, we obtain a \((1+\varepsilon )\)-approximation algorithm for the optimal solution with at most \(\frac{1}{\varepsilon }\) jobs that runs in time

$$\begin{aligned} O\left( \frac{K \log (n) \log (n/\varepsilon ) \log (Kn/\varepsilon )}{\varepsilon ^2} \right) =O\left( \frac{\log (n) \log ^2(n/\varepsilon )}{\varepsilon ^3} \right) =O\left( \frac{\log ^3(n)}{\varepsilon ^3} \right) . \end{aligned}$$

5.4 Dynamically Maintaining Z(Q)—Proof of Lemma 5.4

Now, we describe how to maintain Z(Q), i.e., how to intelligently subdivide the cells with the guarantees restated below:

Lemma 5.4

(Dynamically maintaining Z(Q)) For any fixed positive integer X, it is possible to return a set Z(Q) for any cell Q in the hierarchical decomposition in \(O(X \cdot \log ^3(n))\) query time. Moreover, the returned Z(Q) will satisfy the following properties:

  • For every Q, the optimal solution within each grid slice of Z(Q) is at most \(O(\nicefrac {w(OPT(Q))}{X})\); as a reminder, we ignore jobs with weights less than \(w(OPT(Q)) \cdot \nicefrac {\varepsilon }{n}\).

  • For every Q, \(|Z(Q)| = O(X \cdot \log ^2(n))\)

Proof

Suppose that all job weights are rounded down to powers of 2. For a cell Q, let \(w_{max}(Q)\) denote the reward of the largest-reward job contained completely within Q. Clearly, \(OPT(Q) \ge w_{max}(Q)\). Moreover, by discarding all jobs with weight less than \(\nicefrac {\varepsilon }{n} \cdot w_{max}(Q)\), we discard jobs of total weight at most \(\varepsilon \cdot w_{max}(Q) \le \varepsilon \cdot OPT(Q)\). Accordingly, we focus only on jobs with weights in the range \([\nicefrac {\varepsilon }{n} \cdot w_{max}(Q), w_{max}(Q)]\). As these weights have been rounded to powers of 2, there are only \(\lceil \log (\frac{w_{max}(Q)}{\nicefrac {\varepsilon }{n} w_{max}(Q)}) \rceil = O(\log (n/\varepsilon ))\) distinct remaining weights. Moreover, we assume that \(\nicefrac {1}{\varepsilon }\le n\), as otherwise we can obtain a better algorithm by simply rerunning the classical static algorithm after each update. Altogether, this implies that it suffices to consider \(O(\log (n))\) distinct weights.

In our approach, we consider each distinct weight independently, enabling us to compute a \(Z^{i}(Q)\) for only the jobs with rounded weight \(2^i\). That is, \(Z^{i}(Q)\) is computed with respect to a set of jobs all having the same weight, which lets us treat the computation of \(Z^{i}(Q)\) as if it were performed for the unweighted variant. At the end, we let Z(Q) be the union of the \(O(\log (n))\) different \(Z^{i}(Q)\), giving us a Z(Q) with the desired guarantees. This approach is particularly desirable because, for a fixed weight (i.e., the unweighted variant), we can use ideas very similar to those discussed in Sect. 4 to obtain \(Z^{i}(Q)\). Concretely, for each rounded weight \(2^i\), we maintain a constant-factor approximation of the unweighted problem using the border-based algorithm of Theorem 1.1. In other words, we run the algorithm of Theorem 1.1 with \(\varepsilon '=O(1)\) so that it has update time \(O(\log (n))\) and maintains an O(1)-approximation of the unweighted interval scheduling problem.
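As a small illustration of the bookkeeping (a sketch only; the plain set below stands in for an instance of the unweighted algorithm of Theorem 1.1 run with a constant \(\varepsilon '\)):

```python
import math
from collections import defaultdict

class WeightClassIndex:
    """Routes each job to the structure of its rounded-weight class 2^i, so a
    single O(1)-approximate unweighted instance is updated per operation."""
    def __init__(self):
        self.classes = defaultdict(set)        # i -> ids of jobs with rounded weight 2^i

    @staticmethod
    def weight_class(weight: float) -> int:
        return math.floor(math.log2(weight))   # weights are rounded down to powers of 2

    def insert(self, job_id: int, weight: float) -> None:
        self.classes[self.weight_class(weight)].add(job_id)

    def delete(self, job_id: int, weight: float) -> None:
        self.classes[self.weight_class(weight)].discard(job_id)
```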

Consider \(SOL^{i}\) to be the set of points corresponding to the border-based O(1)-approximation when only considering jobs of rounded weight \(2^i\). In particular, \(SOL^{i}\) contains all start/endpoints of the jobs selected by the approximation, as well as all borders. \(SOL^{i}[L,R]\) contains all points in \(SOL^{i}\) within [L, R]. We define OPT([L, R], i) as the optimal number of jobs one can schedule when only considering jobs with rounded weight \(2^i\) that are fully contained within [L, R].

Claim 5.8

For all i, L, R, it holds that \(OPT([L,R],i) \le |SOL^{i}[L,R]|\).

Proof

Recall that the border-based approximation algorithm maintains a set of borders and finds the optimal solution within each border chunk. The optimal solution within a chunk is computed by the greedy earliest-ending algorithm. Now consider any job J. We claim that J must contain an endpoint of a job in the approximately chosen solution, or it must contain a border. If this were not the case, there would be only two possibilities: (i) J is completely contained within a job chosen by the approximate solution, or (ii) J does not intersect any job chosen by the approximate solution. Case (i) is impossible, as the greedy earliest-ending algorithm would not have chosen a job that completely contains J. Case (ii) is impossible, because J could then be added to the solution within the corresponding border chunk, contradicting the optimality of the solution within each border chunk. As each job J must contain a point of \(SOL^{i}[L,R]\), it must hold that \(OPT([L,R],i) \le |SOL^{i}[L,R]|\). \(\square \)

Also note a similar bound in the opposing direction:

Claim 5.9

For all i, L, R, it holds that \(|SOL^i[L,R]| \le 3 \cdot OPT([L,R],i) + 3\).

Proof

From \(SOL^i[L,R]\), ignore the at most two points corresponding to endpoints of jobs that are only partially within [L, R], and ignore the first remaining point if it corresponds to a border (so we ignore at most 3 points in total). All remaining points of \(SOL^i[L,R]\) correspond to endpoints of jobs fully within [L, R], or to a border following such a job. The number of these jobs with points in \(SOL^i[L,R]\) is at most OPT([L, R], i) by definition. Accordingly, we charge the two points of each such job (and its associated border, if there is one) to a distinct job of OPT([L, R], i), for a total of at most 3 points of \(SOL^i[L,R]\) charged per job in OPT([L, R], i). \(\square \)

All \(SOL^{i}\) can be maintained with update time \(O(\log (n))\) because we only update one unweighted O(1)-approximation per job insertion or deletion. We compute each \(Z^i(Q)\) for a cell Q corresponding to a time range [L, R] by taking \(O(X \log (n))\) quantiles of \(SOL^{i}[L,R]\). Each \(Z^i(Q)\) can be computed with \(O(X \log (n))\) walks down a balanced binary search tree, taking \(O(X \log ^2(n))\) time. We define Z(Q) as the union of the \(O(\log (n))\) different \(Z^i(Q)\). In total, Z(Q) is obtained in \(O(X \log ^3(n))\) time and \(|Z(Q)| = O(X \log ^2(n))\).
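A simplified sketch of assembling Z(Q) is given below; sorted Python lists stand in for the balanced binary search trees, so selecting a quantile is a list index here rather than an \(O(\log (n))\) tree walk, and sol_per_class is a hypothetical map from each rounded-weight class i to the sorted points of \(SOL^{i}\).

```python
from bisect import bisect_left, bisect_right
from typing import Dict, List, Tuple

def quantiles(points: List[float], count: int) -> List[float]:
    """Roughly evenly spaced order statistics of a sorted list."""
    n = len(points)
    if n == 0 or count <= 0:
        return []
    return sorted({points[min(n - 1, (k * n) // count)] for k in range(1, count + 1)})

def build_Z(cell: Tuple[float, float], sol_per_class: Dict[int, List[float]],
            X: int, log_n: int) -> List[float]:
    """Z(Q) is the union, over the O(log n) rounded-weight classes, of
    O(X log n) quantiles of SOL^i restricted to Q's time range [L, R]."""
    L, R = cell
    Z = set()
    for sol_i in sol_per_class.values():
        window = sol_i[bisect_left(sol_i, L): bisect_right(sol_i, R)]   # SOL^i[L, R]
        Z.update(quantiles(window, X * log_n))
    return sorted(Z)
```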

Finally, the optimal solution within any grid slice, ignoring jobs with weight less than \(\nicefrac {\varepsilon }{n} \cdot w(OPT(Q))\), is upper-bounded by the union of the independent optimal solutions for each rounded weight. Within each grid slice of any \(Z^i(Q)\), the optimal solution over jobs of weight \(2^i\) is upper-bounded by \(O(\frac{2^i |SOL^{i}[l_Q,r_Q]|}{X \log (n)}) = O(\frac{2^i OPT([l_Q,r_Q],i)}{X \log (n)})=O(\frac{w(OPT(Q))}{X \cdot \log (n)})\), which follows from Claim 5.8, from taking \(X \log (n)\) quantiles of \(SOL^{i}[l_Q,r_Q]\) to form \(Z^i(Q)\), and from Claim 5.9. Accordingly, summing over the \(O(\log (n))\) different \(Z^i(Q)\), the optimal solution within each grid slice is at most \(O(\nicefrac {w(OPT(Q))}{X})\). \(\square \)

5.5 Combining All Ingredients—Proof of Theorem 1.2

Now, we put everything together to obtain a cohesive solution that efficiently maintains an approximately optimal solution of the desired structure. When we handle an insertion/deletion, we make an update to the corresponding balanced binary search tree T. Recall that we use a balanced binary search tree such as a red-black tree, so that T has depth \(O(\log (n))\) and there are \(O(\log (n))\) rotations per update. For the \(O(\log (n))\) cells Q corresponding to nodes in T affected by rotations, we recompute the data associated with Q, namely Z(Q) and all DP(Q, S). For each such cell Q, we compute a sparse solution for each segment formed by a pair of grid endpoints of \(Z(Q) \cup Z(Q_L) \cup Z(Q_R)\), and a dense solution DP(Q, S) for each segment S formed by a pair of endpoints of Z(Q).

To compute all sparse solutions, we use \(O(|Z(Q) \cup Z(Q_L) \cup Z(Q_R)|^2)\) calls to our algorithm from Lemma 5.1, resulting in a running time of \(O(|Z(Q) \cup Z(Q_L) \cup Z(Q_R)|^2 \cdot \frac{\log ^3(n)}{\varepsilon ^3}) = O(\frac{\log ^8(n)}{\varepsilon ^4} \cdot \frac{\log ^3(n)}{\varepsilon ^3}) = O(\frac{\log ^{11}(n)}{\varepsilon ^7})\). To obtain this complexity, we use the upper bound \(|Z(Q)| = O(X \cdot \log ^2(n))\) from Lemma 5.4 and the fact that we set \(X = \tfrac{\log ^2 n}{\varepsilon ^2}\) at the beginning of Sect. 5.2.3.

To compute all DP(Q, S), we build on the proof of Lemma 5.5. Namely, by the proof of Lemma 5.5, a \((1+\varepsilon )\)-approximate solution is maintained by dividing S into sparse segments, i.e., SPARSE(Q), and dense segments of \(Q_L\) and \(Q_R\), i.e., \(RANGES(Q_L)\) and \(RANGES(Q_R)\). We update our data structure from bottom to top. Hence, once \(DP(Q_L)\) and \(DP(Q_R)\) are updated, we know approximately optimal values for the ranges in \(RANGES(Q_L)\) and \(RANGES(Q_R)\). Thus, to calculate DP(Q, S) we consider an interval scheduling instance whose jobs start and end at grid endpoints of S. In this instance, jobs correspond to all the sparse segments of \(Z(Q),Z(Q_L),Z(Q_R)\) and all the dense segments of \(Z(Q_L),Z(Q_R)\). We compute this dense-segment answer for all dense segments of Z(Q) in \(O(|Z(Q) \cup Z(Q_L) \cup Z(Q_R)|^3)=O\left( \frac{\log ^{12}(n)}{\varepsilon ^{6}} \right) \) time, with a dynamic program whose state is the starting and ending point of a segment and whose transition tries all potential grid endpoints at which to split the range (or just uses the interval from the start to the end); a sketch is given below. For each update, we update the \(O(\log (n))\) cells affected by rotations by recomputing the optimal sparse solutions for their segments and the respective DP(Q, S). Finally, at the beginning of each update, we use \(O(\log (n))\) calls to our algorithm for computing Z(Q) from Sect. 5.4 with \(X = \frac{\log ^2(n)}{\varepsilon ^2}\), taking \(O(X \cdot \log ^3(n)) = O(\frac{\log ^5(n)}{\varepsilon ^2})\) time per cell. As such, our total update time is \(O\left( \log (n) \cdot (\frac{\log ^{11}(n)}{\varepsilon ^7} + \frac{\log ^{12}(n)}{\varepsilon ^{6}} + \frac{\log ^5(n)}{\varepsilon ^2}) \right) = O\left( \frac{\log ^{12}(n)}{\varepsilon ^7} + \frac{\log ^{13}(n)}{\varepsilon ^{6}} \right) \).
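For intuition, here is a minimal sketch of the segment dynamic program used to compute the DP(Q, S) values; the table base_value is hypothetical and stands for the already computed sparse solutions (Lemma 5.1) and the children's dense solutions \(DP(Q_L,\cdot ),DP(Q_R,\cdot )\).

```python
from functools import lru_cache
from typing import Dict, List, Tuple

def combine_segments(endpoints: List[float],
                     base_value: Dict[Tuple[float, float], float]
                     ) -> Dict[Tuple[float, float], float]:
    """DP over segments between grid endpoints: the value of a segment is the
    larger of its base value (a sparse solution or a child's dense solution)
    and the best split of the segment at an intermediate grid endpoint."""
    pts = sorted(endpoints)
    m = len(pts)

    @lru_cache(maxsize=None)
    def best(i: int, j: int) -> float:
        value = base_value.get((pts[i], pts[j]), 0.0)   # use the segment as a whole
        for k in range(i + 1, j):                       # or split at an inner endpoint
            value = max(value, best(i, k) + best(k, j))
        return value

    return {(pts[i], pts[j]): best(i, j) for i in range(m) for j in range(i + 1, m)}
```

With \(m = |Z(Q) \cup Z(Q_L) \cup Z(Q_R)|\), the \(O(m^3)\) transitions match the cubic bound above.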