1 Introduction

The shortest augmenting path (Sap) algorithm is one of the most classical approaches to the maximum matching and maximum flow problems. Using this idea Edmonds and Karp in 1972 have shown the first strongly polynomial time algorithm for the maximum flow problem [5]. Quite astonishingly, although this idea is one of the most basic algorithmic techniques, it is far from being fully understood. It is easier to talk about it by introducing the online bipartite matching problem. In this problem a bipartite graph G = (WB, E) is being revealed online, i.e., in each round one vertex from B with its incident edges arrives. After arrival of this vertex we augment this matching by using shortest augmenting path. It was conjectured by Chaudhuri et al. [4] that the total length of augmenting paths found by Sap is \(\mathcal {O}(n \log n)\). However, no better bound than \(\mathcal {O}(n^{2})\) is known even for trees. Proving this conjecture would have quite striking consequences even for maximum flow problem, as it would show that the total length of augmenting paths in unit capacity networks in Edmonds-Karp algorithm is \(\mathcal {O}(m\log n)\). This consequence is obtained via the bipartite line graph construction that is used to reduce the max-flow problem to maximum matching problem [10]. The obtained bipartite line graph has 2m vertices.

Our paper contributes to the study of Sap algorithm by showing that in the case of trees the total length of all augmenting paths is bounded by \(\mathcal {O}(n \log ^{2} n)\). This result is obtained via the application of the heavy-light decomposition of trees [16] combined with charging technique that carefully assigns shortest augmenting paths to the structure of the tree. Although, this result seems to be restricted only to trees we be believe that it constitutes the first nontrivial progress towards resolving the above conjecture. Moreover, we actually conjecture here that trees are the worst-case examples for this problem. It seems that adding more edges can only help the Sap algorithm. In addition to that we explain why Sap is harder to analyze than other augmenting path algorithms, even though it seems way more natural.

2 Related Work

The online bipartite matching problem with augmentations has recently received increasing research attention [3, 4, 6, 7]. There are several reasons to study this problem. First of all, it provides a simple solution to the online bipartite matching algorithms used in many modern applications such as online advertising (e.g., Google Ads) [12] or client-server assignment [4]. Secondly, they could give rise to new effective offline bipartite matching algorithms as in [3]. Those new algorithms provide new insights to the old problem that was studied for decades.

In this paper we concentrate on bounding the total length of augmenting paths and not on the running time. With this respect, it was shown that if the vertices of B appear in a random order, the expected total paths’ length for Sap is \(\mathcal {O}(n \log n)\) [4]. The worst-case total length of paths remains an open question even for trees. In the class of trees the authors of [4] proposed a different augmenting path algorithm that achieves total paths’ length of \(\mathcal {O}(n \log n)\). On the other hand, for general bipartite graphs greedy ranking algorithm [3] guarantees \(\mathcal {O}(n \sqrt {n})\) total length of paths.

First of all, the above study of online bipartite matching with augmentations should be related to the work of Gupta et al. [7] which shows an \(\mathcal {O}(n)\) bound on the total length of paths, but allows to exceed the capacity of each server by a constant factor.

Another point of view is given by the dynamic matching algorithms. Most papers in this area consider edge updates in a general fully-dynamic model which allows for both insertions and deletions intermixed with each other. We note, however, that the exact results in this model [9, 15] do not imply any bound on the number of changes to the matching. Much faster update times can be achieved by constant approximate algorithms, for example [1, 14], which achieve polylogarithmic and logarithmic update times. Yet, the 2-approximation can be obtained in our setting by trivial greedy algorithm that preforms no changes at all.

Better approximation factor of \(\frac {3}{2}\) was achieved by [13] in \(\mathcal {O}(\sqrt {m})\) update time, and then improved by Gupta and Peng to (1 + ε) in \(\mathcal {O}(\sqrt {m}\varepsilon ^{-2})\) [8]. The \(\mathcal {O}(\sqrt {m})\) barrier was broken by Bernstein and Stein who gave a \((\frac {3}{2}+\varepsilon )\)-approximation algorithm that achieves \(\mathcal {O}(m^{1/4}\varepsilon ^{-2.5})\) update time [2]. The same paper proposes an (1 + ε)-approximation algorithm in very fast \(\mathcal {O}(\alpha (\alpha + \log n) + \varepsilon ^{-4}(\alpha + \log n) + \varepsilon ^{-6})\) update time for the special case of bipartite graphs with constant arboricity. However, when allowing approximation in our model a much better results are possible. An (1 + ε) approximation in \(\mathcal {O}(m\varepsilon ^{-1})\) total time and with \(\mathcal {O}(n\varepsilon ^{-1})\) total length of paths was shown in [3].

3 Preliminaries

We consider the following matching problem. Let W and B be two sets of vertices over which the bipartite graph will be formed. The set W (called white vertices) is given up front to the algorithm, whereas the vertices in B (black vertices) arrive online. We denote by G t = 〈WB t ,E t 〉 the bipartite graph after the t’th black vertex has arrived. The graph G t is constructed online in the following manner. We start with G0 = 〈WB0,E0〉 = 〈W, 〉. In turn t a new vertex b t B together with all its incident edges E(b t ) is revealed and G t is defined as:

$$\left\{\begin{array}{l} E_{t} = E_{t-1}\cup E(b_{t}),\\ B_{t} = B_{t-1}\cup { \left\{ {b_{t}} \right\}}. \end{array}\right. $$

The goal of our algorithm is to compute for each G t the maximum size matching M t . For simplicity we assume that we add in total |W| black vertices. The final graph G|W| which is obtained in this process will be denoted by G = (WB, E). We denote n = |W| = |B| and m = |E|.

For every t ∈ [n], we add orientation to edges of the graph G t . This orientation is induced by matching M t : the matched edges are oriented towards black vertices, while the unmatched edges are oriented towards white vertices. When a new vertex b t arrives, we get an intermediate orientation Gint t = (Eint t ,B t ), where the edges of b t are oriented towards its neighbors, and the rest of the edges is oriented according to Mt− 1. Note that Gint t and Gt− 1 differ only by one vertex b t . Any simple directed path in Gint t from b t to some unmatched white vertex is an augmenting path. In turn t, if b t can be matched, the edges of Gint t are reoriented along augmenting path π t chosen by the algorithm, and the resulting orientation is G t . The unmatched white vertices are called seeds. We denote the set of seeds after turn t as:

$$S_{t} = { \left\{ {w \in W : wb \notin M_{t} \text{ for any } b \in B} \right\} }. $$

So in turn t the augmenting paths in Gint t are the directed paths from b t to some sSt− 1. We refer to the seed of the path π t from turn t as s t , where s t St− 1. We represent a path as a graph consisting of path vertices and path edges. We use the notation \(v \xrightarrow {\pi } v^{\prime }\) to denote that a (directed) path π starts in v and ends in v, and \(v \xrightarrow {} v^{\prime }\) to denote a connection via a directed edge. We use the notation vπ and ρπ to state that a vertex vV (π) and that a path ρ is a subgraph of π, respectively. We also denote the length of a path π as |π|. Throughout the paper, when we write “at time t”, what we formally mean is “in Gint t ”.

The next thing we define is a set of vertices \(\mathcal {D}_{t}\) called dead at time t. The set \(\mathcal {D}_{t}\) is defined as the set of vertices in Gint t that cannot reach St− 1 via a directed path in Gint t . Observe, that if at some point there is no directed path from a vertex to a seed, never again there will be such a path. If a vertex is dead, all vertices reachable from it are dead as well. Hence, no alternating path can enter such a dead region and reorient its edges to make some vertices alive. In other words, \(\mathcal {D}_{t} \subseteq \mathcal {D}_{t + 1}\) for every time moment t. Detailed matching-independent proof of this fact can be found in Section 1.2.2 of [11]. The vertices of \(\mathcal {D}_{t}\) are called dead, while the remaining vertices are called alive.

We now define the effective degree of a black vertex b in turn t as the number of it’s non-dead out-neighbors:

$${\operatorname{degeff}_{t}({b})}=|{\Gamma}_{t}(b) \setminus \mathcal{D}_{t}| $$

where Γ t (b) is the set of vertices v such that \(b \xrightarrow {} v\) in Gint t , referred to sometimes as out-neighbours of b. In particular degeff t(b t ) is the number of all non-dead neighbors (in the undirected sense) of b t , as all the edges adjacent to b t are directed towards its neighbors.

Since we consider in this paper the special case when G t is a tree at any time t, from now on we will refer to G as T, and to G t as T t .

4 Shortest Paths on Trees

In this section we study the shortest augmenting path (Sap) algorithm, which in each turn chooses the shortest among all available augmenting paths. We start by giving an easy argument, that the total length of augmenting paths for Sap is \(\mathcal {O}(n \log n)\) if all vertices b t satisfy degeff t(b t ) > 1. This shows that the difficult case is to deal with vertices of effective degree 1.

Lemma 1

If for each t ∈ [n] it holds that degeff t(b t ) > 1, then the total length of all augmenting paths applied by Sap is bounded by \(\mathcal {O}(n \log n)\).

Proof

Due to the definition of effective degree, every vertex b t connects at least two trees T1 and T2 that contain a directed path connecting b t with a seed. Let T1 be a smaller of the two trees. The length of the shortest path π t from b t to a seed is at most the size of T1. We charge the cost of π t to |π t | arbitrary vertices of T1. During the course of the Sap algorithm, every vertex can be charged at most O(log n) times, as each time it is charged, the size of its tree doubles. The total charge is hence \(\mathcal {O}(n \log n)\). □

The main result of this paper and the subject of the remainder of this section is the bound for the general case, stated in the following theorem.

Theorem 1

The total length of augmenting paths applied by Sap is \(\mathcal {O}(n \log ^{2} n)\).

In order to prove Theorem 1 we introduce a few definitions and observations. The core of our proof is the concept of a dispatching vertex.

Definition 1

A black vertex bB is called dispatching at time t if b is the closest vertex to b t on the path π t that satisfies degeff t(b) > 1. If there is no such vertex at time t we define s t to be the dispatching vertex. We denote a dispatching vertex in time t as dis(π t ). Moreover, for every dispatching black vertex b we define tlast(b) as the moment when b is dispatching for the last time.

So every path π t applied by Sap has a uniquely defined dispatching vertex dis(π t ) assigned to it. The first observation we make is that we only have to care about suffixes of π t ’s starting with dis(π t ).

Definition 2

We split path π t into two segments π t = μ t ρ t , where ρ t is the suffix of π t such that \({\operatorname {dis}({{\pi }_{t}})} \xrightarrow {{\rho }_{t}} s_{t}\). Path μ t = π t ρ t is the remaining part of π t (a possibly empty prefix that ends in a vertex preceding dis(π t )). We refer to the above defined suffixes as dispatching paths.

Lemma 2

The total length of paths μ t is linear in the size of the tree T, i.e.,

$$\textstyle {\sum}_{t \in [n]} |{\mu}_{t}| \in \mathcal{O}(n). $$

Proof

The lemma holds due to Observation 2, proven below, which states that vertices of μ t die at the time t when π t is applied. With this observation it is clear that the time μ t passes through a vertex is the last time Sap visits that vertex. So every vertex in the tree is visited by μ t for any t at most once. □

Observation 2

Vertices of μ t die at the time t when π t is applied.

Proof

At the time when π t is applied, all vertices on μ t have effective degree equal to 1, i.e., they have only one alive directed out-neighbour – their successor on μ t . If we reverse the edges, the only chance for the vertices of μ t to be alive is the last vertex b t . This vertex, however, becomes dead because its only alive out-neighbour is removed. As a consequence the whole path dies. □

To bound the total length of augmenting paths π t , it remains to bound the total length of dispatching paths: \({\sum }_{t \in [n]} |{\rho }_{t}|\). Consider the case when dis(π t ) = s t . Then the path ρ t consists of a single vertex s t . The total sum of paths ρ t satisfying this case is thus \(\mathcal {O}(n)\). It remains to consider the sum over all dispatching paths ρ t that start in a black dispatching vertex. These non-trivial dispatching paths will be, from now on, the focus of our attention. In other words, our goal is to bound the following sum.

Lemma 3

The total length of non-trivial dispatching paths is \(\mathcal {O}(n \log ^{2} n)\) :

$$\sum\limits_{\underset{\operatorname{dis}({{\pi}_{t}}) \in B}{t \in [n]:}} |{\rho}_{t}| \in \mathcal{O}(n \log^{2} n). $$

For the sake of clarity, we split the proof of Lemma 3 into two steps, presented as Lemmas 5 and 6. More precisely, we partition the dispatching paths depending on whether t < tlast(dis(π t )) or t = tlast(dis(π t )), that is, if the dispatching path in question, is the last for its dispatching vertex (cf. Definition 1). In what follows, paths that satisfy the former condition are called non-final and their total length is bounded in Lemma 5, while final paths start at bB when b is a dispatching vertex for the last time; their total length is the subject of Lemma 6.

These results will complete the proof of Theorem 1. However, before we jump into their proofs, we first briefly recall the heavy-light decomposition introduced by Sleator and Tarjan in [16] and state a related technical result, Lemma 4.

For a tree T rooted at r, the original technique partitions its edges into heavy and light, depending on whether the size of the subtree is strictly bigger than half of the size of the subtree rooted at parent. More precisely, let v be any vertex of T other than root r and set p v to be its parent, then an edge {v, p v } is heavy if and only if \(\left | {\mathop {subtree}(v)} \right | > \tfrac {1}{2} \left | {\mathop {subtree}(p_{v})} \right |\), where \(\mathop {subtree}(x)\) is a subtree of T rooted in xV (T). Non-heavy edges are called light.

Observe, that because of the size requirements, each time we traverse a light edge away from the root r, the size of the current subtree halves. In other words, for any vertex v of T there are at most ⌊log 2|T|⌋ light edges on the simple path from r to v. Note that each vertex can have at most two heavy incident edges, thus heavy edges form vertex-disjoint paths. Moreover, paths are of much simpler structure than arbitrary trees, hence allow for more efficient handling despite being possibly numerous.

For convenience, in this paper, we use a slightly modified version, that is, each non-leaf node selects exactly one heavy edge – the edge to the child that has the greatest number of descendants (breaking ties arbitrarily). In particular an edge may be considered heavy, even if the subtree is strictly smaller than half of the size of the current tree. Just like in the original technique, the selected edges form the paths of the decomposition (each non-leaf vertex has at least one and at most two heavy edges), which we call heavy paths. By heavy-path(v) we denote the heavy path to which vertex v belongs, while \(\text {level} : V(T) \to \mathbb {N}\) is the number of light edges on the simple path from a vertex to the root. Observe that level(v) ≤ ⌊log 2|T|⌋ for any vertex v of T.

Lemma 4

Let T be any unrooted tree of size n. For any vertex v let \(S^{v} = \langle {S^{v}_{0}}, {S^{v}_{1}}, {\ldots } \rangle \) be the sequence of subtrees of v (i.e., the connected components of T ∖{v}) ordered descending by their size, that is, |V (Siv)|≥|V (Si+ 1v)|. Then for:

$$\textstyle {\Psi}(v) = {\sum}_{i = 2}^{|S^{v}|-1} |V({S^{v}_{i}})|, $$

we have \({\sum }_{v \in V(T)} {\Psi }(v) \in \mathcal {O}(n \log n)\).

Proof

Let r be a centroid point of T, that is, a vertex such that \(|V({S^{r}_{0}})|\leq \frac {1}{2} |V(T)|\). We root T at r, and perform the heavy-light decomposition of T. Observe that for all vertices vr we have that \({S^{v}_{0}}\) contains r (it corresponds to the parent of v) and \({S^{v}_{1}}\) corresponds to the biggest child of v. In other words, at most \({S^{v}_{0}}\) and \({S^{v}_{1}}\) can be connected by heavy edges, all the other subtrees \({S^{v}_{2}}, {S^{v}_{3}}, \ldots \) are connected by light edges.

Now we take an arbitrary vertex w and calculate how many times it can appear in \({\sum }_{v \in V(T)} {\Psi }(v)\). Suppose v is a vertex that counts w in Ψ(v), then the first edge on the path from v to w has to be light. Moreover, \({S^{v}_{0}}\) is not counted in Ψ(v), so that path cannot pass through the parent of v. Because of that v has to be an ancestor of w. However, there are at most \(\mathcal {O}(\log n)\) light edges on any path from w to the root r for any w. In other words, there can be at most \(\mathcal {O}(\log n)\) vertices that count w in its sum of Ψ. Summing that for all vertices of T we get the desired bound of \(\mathcal {O}(n \log n)\). □

With the help of Lemma 4 we can tackle the first part of Lemma 3, that is, the sum of the lengths of non-final dispatching paths.

Lemma 5

The total length of non-final dispatching paths is \(\mathcal {O}(n \log n)\) :

$$\sum\limits_{\underset{\underset{t < {\operatorname{tlast}({b})}}{b=\operatorname{dis}({{\pi}_{t}})\in B}}{{t \in [n]:}}} |{\rho}_{t}| \in \mathcal{O}(n \log n) $$

Proof

Recall that the path π t starts in the newly added vertex b t . So in turn t either b t is dispatching, or it dies. At any later time t > t at which b t is dispatching \({\pi }_{t^{\prime }}\) does not begin with b t and hence one of b t ’s neighbours dies based on Observation 2.

Consider a fixed vertex b and let \(W^{\mathcal {D}}_{b} \subseteq W\) be the set of neighbors of b that die in turns when b is dispatching. The first time b is dispatching no neighbour of b dies, so \(\mathcal {D}_{t} \cap W^{\mathcal {D}}_{b}=\emptyset \). The second time b is dispatching, it has at least one dead neighbour and set \(\mathcal {D}_{t} \cap W^{\mathcal {D}}_{b}\) has exactly one element, namely the white vertex that preceded b on μ t . More generally, the k-th time b is dispatching, \(\mathcal {D}_{t} \cap W^{\mathcal {D}}_{b}\) has k − 1 elements. Suppose that the total number of times b is dispatching equals l, in particular we know that at some point of time \(\mathcal {D}_{t} \cap W^{\mathcal {D}}_{b}\) will have l − 1 elements. When b is dispatching for the k-th time set \(\mathcal {D}_{t} \cap W^{\mathcal {D}}_{b}\) has only k − 1 members. In other words, b has lk neighbors which are at that turn not yet in \(\mathcal {D}_{t} \cap W^{\mathcal {D}}_{b}\), and thus alive. Furthermore b has at least two white neighbors that do not belong to \(W^{\mathcal {D}}_{b}\) and are alive at the time when b is dispatching for the last time. Therefore, in total b has at least lk + 2 alive white out-neighbours.

We say that a subtree hangs from the neighbour w of b, if it is obtained by the removal of b from T and it contains w. Suppose that we discard two neighbors of b with the heaviest trees hanging from them, i.e., two heaviest neighbours. Then for k = l − 1 we have at least one alive neighbor, for k = l − 2 we have at least two alive neighbors, that is, at least one alive neighbor other than the neighbor used at k = l − 1, and so on. In other words, for any k < l we can find a distinct, not already assigned, alive neighbor w different than the two heaviest neighbors of b. However, the size of the subtree hanging from that neighbour bounds the length of the shortest augmenting path starting at b. Therefore, we can bound the total length of non-final paths dispatching at b by the total size of all subtrees of b except the two heaviest. Summing that up over the whole tree gives us a \(\mathcal {O}(n \log n)\) upper bound, as shown by the previous lemma, Lemma 4. □

We now move on to the second part of Lemma 3, namely we bound the total length of final dispatching paths.

Lemma 6

The sum of lengths of ρ t such that t = tlast(dis(π t )) is bounded from above by \(\mathcal {O}(n\log ^{2}n)\):

$$\sum\limits_{\underset{\underset{t={\operatorname{tlast}({b})}}{b={\operatorname{dis}({{\pi}_{t}})}\in B}}{t \in [n]:}} |{\rho}_{t}| \in \mathcal{O}(n \log^{2} n).$$

To prove the above statement we will need a more fine-grained analysis than before. The problem with the shortest path approach is that its structure and the structure of matchings are very different. To close this gap we introduce yet another family of augmenting paths that relies much more on the structure of the tree. Obviously, because the shortest paths are shorter than any other path, any upper bound on the total length of the aforementioned new family of augmenting paths is an upper bound for the shortest paths as well.

Proof

Similarly to Lemma 4, we root T at a centroid point and preform the heavy-light decomposition of the tree. That is, each vertex selects an edge to its largest subtree, which we call heavy, while all other edges are considered light.

We define λ t as one of the Mt− 1-augmenting paths that connect the newly-added vertex b t to an unmatched white vertex. To be more precise, for each path λ connecting b t with a free white vertex in turn t let the tuple

$$\left\langle {\text{level}(v_{0}),\ \text{level}(v_{1}),\ \text{level}(v_{2}),\ \ldots} \right\rangle$$

represent λ, where v0,v1,v2,… is the sequence of vertices of λ. We define λ t as the path represented by the lexicographically last tuple. Observe that λ t leaves any heavy path as soon as possible.

Now note that all Mt− 1-augmenting paths starting in b t are the same up to dis(π t ) and let dis(λ t ) = dis(π t ). To bound the length of λ t , we split it into three parts as follows:

$$\begin{array}{@{}rcl@{}} {\lambda}^{\prime}_{t} &=& {\mu}_{t},\\ {\lambda}^{\prime\prime}_{t} &=& ({\lambda}_{t}\setminus{\lambda}^{\prime}_{t})\cap\text{heavy-path}(\operatorname{dis}({\lambda_{t}})),\\ {\lambda}^{\prime\prime\prime}_{t} &=& ({\lambda}_{t}\setminus{\lambda}^{\prime}_{t})\setminus\,\text{heavy-path}({\operatorname{dis}({{\lambda}_{t}})}). \end{array} $$

Then \({\lambda }^{\prime \prime }_{t}\) follows heavy-path(dis(λ t )) at most up to the closest vertex b such that tlast(b) > t, namely:

$$ \left| {{\lambda}^{\prime\prime}_{t}} \right| \leq \min\left\{\operatorname{dist}(b, {\operatorname{dis}({{\lambda}_{t}})}) \ \left|\right.\ b \in B\left( \text{heavy-path}({\operatorname{dis}({{\lambda}_{t}})})\right), {\operatorname{tlast}({b})} > t \right\}. $$
(1)

The reason for this is that any vertex b with tlast(b) > t has at least three alive neighbors, with at least two of them reachable from b in Gint t , and at least one not on heavy-path(dis(λ t )). Any such b is a vertex at which \({\lambda }^{\prime \prime }_{t}\) can leave heavy-path(dis(λ t )).

Consider an arbitrary heavy path H and the set D of vertices on H that are ever dispatching in the entire run of the algorithm. Using D0 = D we split H into at least d0 = |D0| non-empty fragments \(h_{0}, h_{1},\ldots ,h_{d_{0} - 1}\). Each such part \(h\in \{h_{0}, h_{1}, \ldots , h_{d_{0}-1}\}\) has at least one of its endpoints in D0, and usually both, unless h is the first or the last fragment. Thus, we can assign h to one of its ending vertices with preference for earlier turn tlast if there are two available. Formally \(f_{0} : \{h_{0}, h_{1}, \ldots , h_{d_{0}-1}\} \to D_{0}\), where:

$$f_{0}(h) = \underset{b\;\in\,V(h)\,\cap\,D_{0}}{\text{arg min}} {\operatorname{tlast}({b})}. $$

Due to Inequality (1) in the previous paragraph we have:

$$\left| {{\lambda}^{\prime\prime}_{{\operatorname{tlast}({f_{0}(h_{i})})}}} \right| \leq \left| {h_{i}} \right| \text{ for any } 0 \leq i < d_{0}.$$

This means that the length of H bounds the total length of λ’s related to the dispatching vertices in the image of f0. However, as at most two h i ’s can be assigned to the same vertex of D0, the image of f0 constitutes at least a half of D0.

To take care of the rest of D0, we iterate this reasoning. We construct a sequence of sets D0D1 ⊇…, each step halving the size of D i . More precisely, we set:

$$D_{i} = D_{i-1} \setminus { \left\{ {f_{i-1}\left( {h^{i-1}_{j}} \right) \ \left|\right.\ 0 \leq j < d_{i-1}} \right\} }, $$

where \({h^{i}_{0}},{h^{i}_{1}},\ldots ,h^{i}_{d_{i}-1}\) are the parts of H after the split by D i and functions \(f_{i} : \{{h^{i}_{0}},{h^{i}_{1}},\ldots ,h^{i}_{d_{i}-1}\} \to D_{i}\) are defined as:

$$f_{i}(h^{i}) = \underset{b\;\in\,V(h^{i})\,\cap\,D_{i}}{\text{arg min}} {\operatorname{tlast}({b})}. $$

In other words, at most log n copies of H cover all λ paths related to H. Summing this up over all heavy paths gives us:

$${\sum}_{\underset{\underset{t={\operatorname{tlast}({b})}}{b={\operatorname{dis}({{\pi}_{t}})}}\in B}{t \in [n]:}}\left| {{\lambda}^{\prime\prime}_{t}} \right| \in \mathcal{O}(n\log n). $$

Furthermore, it also means that for any vV (H) at most log n of λ″′ paths may start in v. That is, log n copies of all non-heavy subtrees of vV (H) cover all λ″′ paths starting in v, which, by Lemma 4, implies

$${\sum}_{\underset{\underset{t={\operatorname{tlast}({b})}}{b={\operatorname{dis}({{\pi}_{t}})}\in B}}{t \in [n]:}}{ \left| {{\lambda}^{\prime\prime\prime}_{t}} \right| } \in \mathcal{O}(n\log^{2}n). $$

From the last two bounds we infer the statement of the Lemma 6. This also completes the proof of Theorem 1. □

5 Playing Against an Adversary

In the last section of this paper we discuss a quite surprising characteristic of Theorem 1 and its implications. Namely, nowhere in the proofs of Lemmas 3, 4, 5 and 6 we rely on the shape of any particular matching at any given turn, or even on the fact that these matchings are related to each other. To be more specific, we depend only on the structure of the tree, the properties of the dead and alive vertices, and the cardinality of the matchings. This leads us to a generalization of the setting in question and a respective counterpart of Theorem 1.

We define the adversarial dynamic augmenting path setting as a setup similar to the one from Section 3 in which, as before, each turn we are given a single black vertex with all its edges. However, the matching we use to calculate the shortest augmenting path is not the one produced by the algorithm in the previous turn, but some arbitrary matching of the same cardinality provided by the adversary. In particular, the edges might be oriented, wherever possible, away from the newly added vertex, thus making the augmenting paths the longest possible. Nonetheless, because we do not depend on the structure of the matching, the total length of all such augmenting paths is still small.

Corollary 1

If the graph in the above setting is a tree, then the total length of all the shortest augmenting paths is \(\mathcal {O}(n\log ^{2}n)\).

It seems that this is true also in general bipartite graphs, and thus we form the following conjecture.

Conjecture 1

The total length of all the shortest augmenting paths in the setting above, that is, with the matching changing arbitrarily each turn, is still \(\mathcal {O}(n \log n)\) worst case for any bipartite graph.

The ramifications of that conjecture are twofold. First, it suggests a new perspective and a new research angle in which we are allowed to change the matching to fit into some schema. That could possibly lengthen the paths in the process, but it might make the problem a bit more predictable and less dynamic, hence, in some aspects, easier. Second, it might allow for better algorithms. A matching procedure based on the above idea could alter the calculated matching during some turns in a random way, thus perhaps making its worst case less bad. As the reasons behind this phenomenon are far from clear, in authors’ opinion Conjecture 1 is an interesting open problem.