1 Introduction

The classical information dissemination problem in radio networks is the problem of broadcasting a single message to all nodes of the network (single-message broadcast). This problem and its generalizations have received extensive attention.

A characteristic of radio networks is that multiple messages that arrive at a node simultaneously interfere (collide) with one another, and none of them is received successfully. Depending on whether nodes can distinguish such a collision from complete silence, the model is usually divided into two categories: with and without collision detection. Throughout studies of problems in radio networks, it has been observed that many problems can be solved faster in the model with collision detection [21]. Despite this trend, it had remained unclear whether this is also the case for broadcast [20]. We show that single-message broadcast can indeed be solved faster, in just diameter plus poly-logarithmic time, if collision detection is available. Even though our work is theoretical, we remark that most practical radio networks can detect collisions.

Broadcasting \(k\) messages from one node to all nodes is a natural and important generalization of the single-message broadcast problem. Usually, this generalization involves new and significantly different challenges, mainly because the dissemination of different messages can interfere with each other. We show how to overcome these challenges and obtain an (almost) optimal \(k\)-message broadcast algorithm.

1.1 Model and problem statements

We work in the radio network model with collision detection [5]: a synchronous undirected network \(G=(V,E)\) where in each round, each node either transmits a packet of \(B\) bits or listens. As a standard assumption, to ensure that each packet can contain a constant number of ids, we assume that \(B=\varOmega (\log n)\). Each node \(v\) receives a packet from its neighbors only if it listens in that round and exactly one of its neighbors is transmitting. If two or more neighbors of \(v\) transmit, then \(v\) only detects the collision, which is modeled as \(v\) receiving a special symbol \(\top \) indicating a collision. We note that some of our results hold even in the model without collision detection, where if two or more neighbors of \(v\) transmit, then \(v\) does not receive anything.

The single-message broadcast problem is defined as follows: A single source node has a single message of length at most \(\varTheta (B)\) bits and the goal is to deliver this message to all nodes in the network. The \(k\)-message single-source broadcast problem is defined similarly, with the difference that the source has \(k\) messages which need to be delivered to all other nodes. We focus on randomized solutions to these problems where we require that the message(s) are delivered to all nodes with high probability. In the unknown topology setting (which is our default setting), we assume that nodes know a polynomial upper bound on \(n\) and a constant factor upper bound on diameter \(D\). In the known topology setting, similar to [8], we assume that nodes know the whole graph.

1.2 Our results

Our main results are as follows:

Theorem 1

In radio networks with unknown topology and with collision detection, there is a randomized distributed algorithm that broadcasts a single message in \(O(D + \log ^6 n)\) rounds, with high probability.

Theorem 2

In radio networks with known topology (even without collision detection), there is a randomized distributed algorithm that broadcasts \(k\) messages in \(O(D + k \log n + \log ^2 n)\) rounds, with high probability.

Theorem 3

In radio networks with unknown topology and with collision detection, there is a randomized distributed algorithm that broadcasts \(k\) messages in \(O(D + k \log n + \log ^6 n)\) rounds, with high probability.

About Theorem 1, we remark that prior to this work, the best known solutions for single-message broadcast were the \(O(D\log {n/D}+\log ^2 n)\)-round algorithms presented independently by Czumaj and Rytter [7] and by Kowalski and Pelc [16], for the model without collision detection. In that model, these bounds are optimal [1, 18]. As Peleg points out in [20], prior to this work, it was unclear whether these upper bounds can be improved in the model with collision detection. Theorem 1 answers this question by showing that a better upper bound is indeed achievable. We remark that the bound of Theorem 1 is within an additive poly-logarithmic term of the \(\varOmega (D+\log ^2 n)\) lower bound, which follows from the \(\varOmega (\log ^2 n)\) lower bound of [1] and the obvious lower bound of \(\varOmega (D)\).

About Theorems 2 and 3, we remark that these two results use random linear network coding (RLNC). Moreover, we note that even in the strong model of centralized algorithms with full topology knowledge, with collision detection, and with network coding, \(k\)-message broadcast has a lower bound of \(\varOmega (D+k\log n+ \log ^2 n)\) rounds. This lower bound follows from the \(\varOmega (k \log n)\) throughput-based lower bound of [2] for \(k\)-message broadcast, the \(\varOmega (\log ^2 n)\) lower bound of [1] for single-message broadcast, and the trivial \(\varOmega (D)\) lower bound. Thus, the complexity of Theorem 2 is optimal and the complexity of Theorem 3 is optimal modulo the additive \(O(\log ^6 n)\) term.

When looking at the issue from a practical angle, Theorems 1 and 3 have an interesting message: they show that one can replace the (expensive and not-completely-reasonable) assumption of all nodes knowing the full topology of the network, with (the considerably more reasonable and usually-available) collision detection, and still perform single or multiple broadcast tasks almost in the same time.

To achieve the above three results, we present three new technical elements, each of which may be of independent interest:

  1. (A) The first element is a distributed construction of a gathering-spanning-tree (GST) with round complexity \(O(D\log ^4 n)\). GSTs were first introduced by [8] to obtain broadcast algorithms with an additive \(O(D)\) diameter dependence in the known topology setting [8, 9, 19]. The only known construction of a GST prior to this work was the centralized algorithm of Gasieniec et al. [8], which has step-complexity of \(O(n^2)\) operations and requires full knowledge of the graph. We use our new GST construction to prove Theorem 1. For this, we first decompose the graph appropriately, then construct a GST for every part in parallel, and lastly use this setup to broadcast the (single) message efficiently.

  2. (B) The second element is a novel transmission schedule atop GSTs for solving the multi-message broadcast problem. We contend that this schedule is the right generalization of [8] to multiple messages. Such a generalization was also attempted in [19], but its correctness was disproved [22].

  3. (C) The third element is backwards analysis, a new way to analyze the progress of messages during a multi-message radio network broadcast. Backwards analysis shows that a message spreads quickly even when other messages that are spread at the same time cause collisions. A priori, it is not clear that information dissemination remains efficient in the presence of these collisions, which only arise in the multi-message setting. Insights from the backwards analysis were crucial in the design of our multi-message transmission schedule and also enable us to apply the projection analysis of Haeupler [12] for analyzing random linear network coding to prove Theorems 2 and 3.

1.3 Related work

Designing distributed broadcast algorithms for radio networks has received extensive attention, starting with the pioneering work of Bar-Yehuda, Goldreich and Itai (BGI) [3]. Here, we present a brief review of the results that directly relate to this paper.

Single-message broadcast Peleg [20] provides a comprehensive survey of the known results about single-message broadcast. BGI [3] present the Decay protocol which broadcasts a single message in \(O(D\log n+ \log ^2 n)\) rounds. The best known distributed algorithms for single-message broadcast for the setting where the topology is unknown are the \(O(D\log {\frac{n}{D}}+\log ^2 n)\) algorithms presented independently by Czumaj and Rytter [7], and Kowalski and Pelc [16]. These algorithms can be viewed as clever optimizations of the Decay protocol [3]. Moreover, similar to the Decay protocol, these two algorithms are presented for the model without collision detection and are optimal in that model [1, 18]. Prior to this work, no better algorithm was known for the model with collision detection. If the topology of the network is known, then the algorithm of Gasieniec et al. [8] achieves the optimal \(O(D + \log ^2 n)\) time complexity. Kowalski and Pelc [17] gave an explicit deterministic broadcast protocol which achieves the same time complexity.

Multi-message broadcast The complexity of multi-message broadcast (with bounded packet size) is less understood. In the model without collision detection, the following results are known. The earliest work on the multi-message broadcast problem is by Bar-Yehuda et al. [4], which broadcasts \(k\) messages in \(O((n + (k + D) \log n) \log \varDelta )\) rounds, where \(\varDelta \) is the maximum node degree. Chlebus et al. [6] present a deterministic algorithm that broadcasts \(k\) messages in \(O(k\log ^3 n + n\log ^4 n)\) rounds. Khabbazian and Kowalski [15] and Ghaffari and Haeupler [10] give randomized algorithms that reduce the dependency on \(k\) to \(O(k \log n)\) using coding techniques. Ghaffari et al. [2] give an \(\varOmega (k \log n)\) lower bound which shows that this throughput is optimal and furthermore study whether coding is necessary to achieve this throughput. The randomized algorithms of [15] and [10] broadcast \(k\) messages in \(O(k\log \varDelta +(D+\log n)\log n\log \varDelta )\) rounds. Again, prior to this work, no better algorithm was known for the model with collision detection.

2 Single-message broadcast

We first recall the definition of a GST [8] in Sect. 2.1. Then, in Sect. 2.2, we present a distributed algorithm with time complexity \(O(D \log ^4 n)\) for constructing a GST in radio networks with unknown topology (even without collision detection). In Sect. 2.3, we then show that this algorithm can be used to broadcast a single message in \(O(D+ \log ^6 n)\) rounds, in radio networks with unknown topology but with collision detection.

2.1 Gathering spanning trees (GST)

Ranked BFS Consider a BFS tree \(\mathcal {T}\) in graph \(G\), rooted at source node \(s\). Also, suppose that in this tree, we have assigned to each node \(v\) a level number \(\ell (v)\), which is equal to the distance of \(v\) from \(s\). We rank the nodes of \(\mathcal {T}\) using the following inductive ranking rule: Each leaf of \(\mathcal {T}\) gets rank \(1\). Then, consider node \(v\) and suppose that all children of \(v\) in \(\mathcal {T}\) are already ranked. Let \(r\) be the maximum rank of these children. If \(v\) has exactly one child with rank \(r\), then node \(v\) gets rank \(r\). If \(v\) has two or more children with rank \(r\), then \(v\) gets rank \(r+1\). As shown in [8], one can easily see that in each ranked BFS, the largest rank is at most \(\lceil \log _{2} n \rceil \).
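
As a concrete illustration of this ranking rule (a sketch with our own representation, not code from [8]), the following function computes the rank of every node of a rooted tree given by child lists.

```python
def compute_ranks(children, root):
    """Rank a rooted tree: a leaf gets rank 1; an internal node gets the maximum
    rank r among its children if exactly one child attains r, and r + 1 if two
    or more children attain r."""
    rank = {}
    order, stack = [], [root]
    while stack:                          # iterative DFS; parents appear before children
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))
    for v in reversed(order):             # so children are ranked before their parent
        child_ranks = [rank[c] for c in children.get(v, [])]
        if not child_ranks:
            rank[v] = 1
        else:
            r = max(child_ranks)
            rank[v] = r if child_ranks.count(r) == 1 else r + 1
    return rank
```

On a path every node gets rank \(1\), while on a complete binary tree the rank grows by one per level, which is where the \(\lceil \log _{2} n \rceil \) bound is tight.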

Gathering spanning tree (GST) [8] A ranked BFS-tree \(\mathcal {T}\) is called a GST of graph \(G\) if and only if the following collision-freeness property is satisfied:

Collision-freeness property: for any two distinct nodes \(u_1\) and \(u_2\) of rank \(r\) whose parents \(v_1\) and \(v_2\) in \(\mathcal {T}\) are distinct and also have rank \(r\), graph \(G\) contains no edge between \(u_1\) and \(v_2\) and no edge between \(u_2\) and \(v_1\).

Fast stretches in a GST In a GST \(\mathcal {T}\), a path in \(\mathcal {T}\) from a node \(v\) to a descendant \(u\) of \(v\) is called a fast stretch if all the nodes on the path have the same rank. Note that a fast stretch might consist of just a single node.

Distributed GST In a distributed construction of a GST, each node \(u\) must learn the following four items: (1) its level \(\ell (u)\), (2) its own rank \(r(u)\), (3) the id of its parent \(v\), and (4) the rank of its parent \(r(v)\).

Figure 1 presents an example of a GST. The black edges represent the graph \(G\) and the thicker green edges represent a rank-labeled BFS tree \(\mathcal {T}\) of \(G\). On the left side, we see a rank-labeled BFS tree, but this tree is not a GST because of the violation of the collision-freeness property indicated by the red dashed arrow. On the right side, we see another rank-labeled BFS tree of the same graph \(G\), which is a GST. In this GST, the green edges that are coated with wide blue lines indicate the fast stretches. Each node that is not incident on any of these blue-coated edges forms a trivial fast stretch consisting of just a single node.

Fig. 1 Gathering spanning tree

Broadcast atop GST In [8], Gasieniec et al. presented an algorithm to broadcast a single message in \(O(D+\log ^2 n)\) rounds atop a GST. A high-level explanation is as follows: with careful timing, the message can be sent through the fast stretches without any collision. That is, we can (almost simultaneously) send the message through different stretches such that in each fast stretch, the message gets broadcast from the start of the stretch to the end of the stretch in time proportional to the length of the stretch. On the other hand, since the largest rank in the tree \(\mathcal {T}\) is at most \(\lceil \log _{2} n \rceil \) and because on each path from the source to any node \(v\) the ranks are non-increasing, we get that the path from the source to each node \(v\) is made of at most \(\lceil \log _{2} n \rceil \) distinct fast stretches. By using the Decay protocol [3] on each of the (at most) \(\lceil \log _{2} n \rceil \) connections between the fast stretches, we get a broadcast algorithm with time complexity \(O(D + \log ^2 n)\). We refer the reader to [8] for the details of this broadcast algorithm. We remark that we will use [8] simply as a black-box that broadcasts a single message in time \(O(D + \log ^2 n)\) on top of the GSTs we construct.

2.2 Distributed GST construction

In this subsection, we present the following result:

Theorem 4

In radio networks (even without collision detection), there exists a distributed GST construction algorithm with round complexity \(O(D \log ^4 n)\).

We first show a GST construction with round-complexity \(O(D \log ^5 n)\) in Sects. 2.2.1–2.2.3. We then improve this to \(O(D \log ^4 n)\) rounds in Sect. 2.2.4.

2.2.1 Black-box tools

Before starting the construction, we first present two black-box tools which we use in our construction.

Decay protocol [3] Rounds are divided into phases of \(\log n\) rounds, and in the \(i\)th round of each phase, each node \(v\) transmits with probability \(2^{-i}\) (if it has a message for transmission).
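
For illustration only (our own sketch, not the authors' code), one phase of the Decay protocol can be written as follows; it returns, for each round of the phase, the set of nodes that transmit.

```python
import math
import random

def decay_phase(nodes, has_message, n):
    """One Decay phase: in the i-th round of the phase, every node that has a
    message transmits it independently with probability 2**(-i)."""
    schedule = []
    for i in range(1, math.ceil(math.log2(n)) + 1):
        transmitters = {v for v in nodes
                        if has_message[v] and random.random() < 2.0 ** (-i)}
        schedule.append(transmitters)
    return schedule
```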

Lemma 1

[3] For each node \(v\), if at least one neighbor of \(v\) has a message for transmission, then in each phase of the Decay protocol, node \(v\) receives at least one message with probability at least \(\frac{1}{8}\). Moreover, in \(\varTheta (\log n)\) such phases, \(v\) receives at least one message, with high probability.


Recruiting protocol This tool can be abstracted by the guarantees that it provides, which we present in Lemma 2.

Lemma 2

Consider a bipartite graph \(\mathcal {H}\) where nodes on one side are called red and nodes on the other side are called blue. The recruiting protocol achieves the following three properties, w.h.p., in \(\varTheta (\log ^3 n)\) rounds: (a) for each blue node \(u\), we assign an adjacent red node \(v\) to \(u\). In this case, we say \(u\) is recruited by \(v\) (then called parent of \(u\)), (b) each red node \(v\) knows whether it recruited zero, one, or at least two blue nodes, (c) each recruited blue node \(u\) knows whether its parent \(v\) recruited zero, one, or at least two blue nodes.

Proof of Lemma 2

We show that each blue node is recruited with high probability. The other parts follow easily from the description of the algorithm.

Consider an arbitrary blue node \(u\). We first argue that there are \(\varTheta (\log n)\) iterations such that in the first round of each of these iterations, \(u\) receives the message of a red node. Indeed, in each iteration \(j\) such that \(2^{j} \in [\frac{d(u)\,\cdot \,\log n}{2},\; 2d(u)\,\cdot \,\log n]\), where \(d(u)\) denotes the degree of \(u\) in \(\mathcal {H}\), node \(u\) receives a message in the first round of iteration \(j\) with constant probability. A Chernoff bound then shows that, with high probability, in \(\varTheta (\log n)\) of these iterations \(u\) receives the message of a red node in the first round.

Consider one such recruiting iteration, and suppose that in its first round, \(u\) receives the message of red node \(v\). From the properties of the Decay protocol, during the \(\varTheta (\log n)\) rounds of the Decay phase of that iteration, with constant probability the red node \(v\) either receives the message of \(u\) or receives at least two messages from blue nodes. Moreover, if \(v\) receives a message from a blue node \(w\), then \(w\) had received the message of node \(v\) in the first round of this iteration. This is because, since \(v\) transmitted in that round, \(w\) could not have received from any other red node \(v'\), and since \(w\) is transmitting in the Decay phase, we know that it has received the message of one red node. Thus, we conclude that with constant probability, the red node \(v\) receives either the message of \(u\) or at least two messages from blue nodes. In either case, \(u\) gets recruited. Note that \(u\) receives the message of \(v\) in the last round of the iteration simply because this round is an exact repetition of the transmissions of the first round of the iteration, in which \(u\) received a message from \(v\).

Now in \(\varTheta (\log ^2 n)\) recruiting iterations, there are \(\varTheta (\log n)\) iterations where in their first round, \(u\) receives the message of a red node. Since in each such iteration \(u\) is recruited with a constant probability, we get that after the full run of the Recruiting protocol, \(u\) is recruited with high probability. \(\square \)
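
The Recruiting protocol is used in the remainder of the paper only through the guarantees of Lemma 2. Purely for intuition, the following is a plausible sketch consistent with the iteration structure used in the proof above (a first round in which red nodes transmit, a Decay response phase, and a repetition of the first round); the probability schedule `probs` and all names are our assumptions, and the Decay response phase is abstracted away using Lemma 1.

```python
import math
import random

def recruiting_protocol(red, blue, adj, n, probs):
    """Hedged sketch of a recruiting protocol on a bipartite graph H.
    adj[u] is the set of H-neighbors of u; probs is an assumed schedule of
    transmission probabilities for the red nodes, cycled over the iterations."""
    parent = {}                              # blue node -> red node that recruited it
    children = {v: set() for v in red}       # red node -> blue nodes it recruited
    for it in range(math.ceil(math.log2(n)) ** 2):
        p = probs[it % len(probs)]
        # First round: each red node transmits its id with probability p.
        senders = {v for v in red if random.random() < p}
        heard = {}
        for u in blue:
            if u in parent:                  # already recruited in an earlier iteration
                continue
            s = senders & adj[u]
            if len(s) == 1:                  # u hears a red node only without collision
                heard[u] = next(iter(s))
        # Decay response phase (abstracted via Lemma 1): each red node that was
        # heard learns, w.h.p., that at least one blue node responded.
        for u, v in heard.items():
            children[v].add(u)
            parent[u] = v
        # Last round: the first round is repeated verbatim, so each newly
        # recruited blue node hears its parent again and records its id.
    return parent, children
```

The returned `children` map directly provides properties (b) and (c) of Lemma 2: each red node knows whether it recruited zero, one, or at least two blue nodes.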

2.2.2 GST construction outline

We first construct a BFS-tree of \(G\) and assign to each node \(v\) a level \(\ell (v)\) that is equal to the distance of \(v\) from the source. This can be done in \(O(D\log ^2 n)\) rounds, as follows: Rounds are divided into \(D\) epochs, each consisting of \(\varTheta (\log n)\) phases of the Decay protocol (thus, each epoch has \(\varTheta (\log ^2 n)\) rounds). In each epoch, a node \(v\) participates in the Decay phases if and only if it is the source or it has received a message by the end of the previous epoch. During these rounds, each node relays the first message it received. The epoch in which a node \(v\) receives a message for the first time determines the BFS level \(\ell (v)\) of node \(v\).
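
A minimal sketch of this epoch-by-epoch layering, with the Decay phases inside an epoch abstracted by Lemma 1 (w.h.p., every node with an informed neighbor becomes informed within one epoch of \(\varTheta (\log ^2 n)\) rounds); names are illustrative.

```python
def bfs_levels_by_epochs(graph, source, D):
    """Level assignment: level(v) is the index of the first epoch in which v
    receives a message; the source has level 0."""
    level = {source: 0}
    informed = {source}
    for epoch in range(1, D + 1):
        frontier = {u for v in informed for u in graph[v]} - informed
        for u in frontier:
            level[u] = epoch
        informed |= frontier
    return level
```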

Now that we have a BFS-tree, we build the GST on top of this BFS layering, level by level, from the largest level towards the source. For this, the problem boils down to the following scenario: Consider level \(l\) of the layering and assume that the GST is already built for levels \(j \ge l\). Consider the bipartite graph \(H\) induced on the nodes of levels \(l-1\) and \(l\), ignoring the (possible) edges inside each level. The core of the problem is to design an algorithm that constructs the part of the GST between levels \(l-1\) and \(l\), i.e., the part that lies in \(H\).

Let us call the nodes on level \(l-1\) red nodes, and the nodes on level \(l\) blue nodes. To construct the part of GST that is in \(H\), we assign a red parent \(v\) to each blue node \(u\), from amongst the red neighbors of \(u\) in \(H\). In this case, \(v\) is known as \(u\)’s parent and \(u\) is a child of \(v\). This assignment, along with the rankings of blue nodes, leads to a ranking for the red nodes. More precisely, let \(v\) be a red node and let \(i\) be the maximum rank of blue node children of \(v\) in the assignment. Node \(v\) gets rank \(i\) if it has only one child with rank \(i\), and \(v\) gets rank \(i+1\) if it has more than one child with rank \(i\).

To have a GST, these assignments should be collision-free. That is, if there exist blue nodes \(u_1\) and \(u_2\) with respective parents \(v_1\) and \(v_2\), all four with rank \(i\), then \(H\) must have no edge between \(v_1\) and \(u_2\) and no edge between \(v_2\) and \(u_1\). Mathematically, if we let \(\mathcal {M}\) be the set of edges between blue nodes \(u\) of rank \(i\) and their respective red parents \(v\) of rank \(i\), then \(\mathcal {M}\) should be an induced matching of graph \(H\). We refer to the problem of finding such an assignment as the bipartite assignment problem.

More precisely, in the bipartite assignment problem, we should achieve the following six properties: (1) for each blue node \(u\), we should assign a red neighbor \(v\) as its parent; (2) we should rank the red nodes as follows: for each red node \(v\), suppose \(i\) is the maximum rank of the children of \(v\); then, \(v\) should get rank \(i\) if \(v\) has exactly one blue child of rank \(i\), and rank \(i+1\) if \(v\) has two or more blue children of rank \(i\); (3) the assignment should be collision-free; (4) each red node must know its rank; (5) each blue node \(u\) should know the id of its parent; and (6) each blue node \(u\) should know the rank of its parent.

The bipartite assignment problem is the core of the GST construction and once we have a solution for it, repeating the solution level by level from the largest level to source constructs a GST. In the next subsection, we explain how to solve this problem in \(O(\log ^5 n)\) rounds.
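
For concreteness, property (3) above (collision-freeness) can be checked as follows; `H[u]` denotes the set of red neighbors of blue node \(u\), and the function is an illustrative (centralized) checker, not part of the distributed algorithm.

```python
def is_collision_free(H, parent, rank):
    """The edges between blue nodes of rank i and their red parents of rank i
    must form an induced matching of H, for every rank i."""
    same_rank_pairs = [(u, v) for u, v in parent.items() if rank[u] == rank[v]]
    for u1, v1 in same_rank_pairs:
        for u2, v2 in same_rank_pairs:
            if u1 != u2 and v1 != v2 and rank[u1] == rank[u2]:
                if v1 in H[u2] or v2 in H[u1]:
                    return False             # collision: a cross edge exists
    return True
```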

2.2.3 The bipartite assignment algorithm

Consider the bipartite graph \(H\) as explained. We solve the bipartite assignment problem (defined above) in \(H\) on a rank-by-rank basis, starting with the largest possible rank \(\lceil \log n \rceil \) (of blue nodes) and going down in ranks until reaching rank \(1\). We spend \(\varTheta (\log ^4 n)\) rounds on each rank. Let us consider the case of the bipartite assignment for blue nodes of rank \(i\) in graph \(H\), assuming that ranks greater than \(i\) are already solved.

We first identify the red neighbors of the blue nodes with rank \(i\). This is done by using \(\varTheta (\log n)\) phases of the Decay protocol in which the blue nodes of rank \(i\) transmit. This identifies the desired red nodes as, with high probability, every such red node receives at least one message, while no other red node receives any message. From now on, throughout the procedure for rank \(i\), only these red nodes are active. Now the algorithm is divided into \(\varTheta (\log n)\) epochs. Each epoch consists of three stages as follows:

  • Stage I Call a blue node \(u\) of rank \(i\) a loner if \(u\) has exactly one active red neighbor. We first detect the loner blue nodes. For this, in one round, each active red node transmits a message. Only loner blue nodes receive a message; every other active blue node experiences a collision (and thus knows it is not a loner). We then use \(\varTheta (\log n)\) phases of the Decay protocol in which the loner blue nodes transmit. This, with high probability, informs all red nodes that are connected to at least one loner blue node. We call these red nodes loner-parents.

  • Stage II This stage is divided into three parts, and each red node is active in only one of the parts. Loner-parents, which we identified in stage I, are active only in part 1. Every other active red node uniformly at random decides to be either brisk or lazy, meaning that it is active in part 2 or part 3, respectively. These parts are as follows:

    • Part 1 Loner-parents run the Recruiting protocol. During this protocol, each blue neighbor of each red loner-parent gets recruited with high probability. These assignments are permanent. All the blue nodes that are recruited become inactive for the rest of the assignment problem.

    • Part 2 Brisk red nodes run a Recruiting protocol. Then, each blue node that is not the only recruited child of its parent considers its parent as its permanent GST parent and becomes inactive permanently (for the GST construction). The other recruited blue nodes become inactive only for the remainder of this epoch, but these assignments are temporary and the related nodes restart in the next epoch, ignoring their temporary assignments.

    • Part 3 We repeat the procedure of part 2, but this time with lazy red nodes and with the active blue nodes that did not get recruited in parts 1 or 2.

  • Stage III Let us say that a red node is marked if it was a loner-parent or if it recruited zero or at least two blue nodes in parts 2 or 3. Each marked red node becomes inactive after this epoch. Thus, the only red nodes that remain active after this epoch are those that do not have any loner neighbor and recruited exactly one child in part 2 or 3 of stage II. Each marked red node knows whether it recruited zero, one, or at least two children (in stage II). We use this knowledge to rank the marked red nodes, giving them rank \(i\) if they recruited exactly one blue child and rank \(i+1\) if they recruited more than one blue child. Blue children of marked red nodes also know that their parents are marked and can compute the rank of their parents (refer to property (c) of Lemma 2). Before inactivating the marked red nodes, we do one simple thing: marked red nodes run \(\varTheta (\log n)\) phases of the Decay protocol sending their id and rank. Each blue node of any rank strictly lower than \(i\) that receives a red node id considers the first red node that it heard from as its permanent GST parent, records the id and rank of that red parent, and then becomes inactive for the rest of the assignment problem.

After running the bipartite assignment algorithm for all the ranks, if a red node \(v\) has no child, then \(v\) is a leaf and in the GST, \(v\) gets rank \(1\).
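
At a high level, one epoch of the algorithm for rank \(i\) can be summarized as in the following sketch, where `recruit(reds, blues)` stands for the Recruiting protocol of Lemma 2 restricted to the given node sets and returns a map from blue nodes to the red nodes that recruited them; the radio implementation, the ranking of the marked red nodes, and the final Decay broadcast of stage III are abstracted away, and all names are ours.

```python
import random

def assignment_epoch(active_red, active_blue, H, recruit):
    """One epoch for rank i: returns the permanent and temporary assignments
    made in this epoch and the red nodes that remain active afterwards."""
    # Stage I: a loner is an active blue node with exactly one active red neighbor.
    loners = {u for u in active_blue if len(H[u] & active_red) == 1}
    loner_parents = {v for v in active_red if any(v in H[u] for u in loners)}
    # Stage II, part 1: loner-parents recruit; these assignments are permanent.
    permanent = dict(recruit(loner_parents,
                             {u for u in active_blue if H[u] & loner_parents}))
    remaining = active_blue - set(permanent)
    # Parts 2 and 3: the remaining red nodes split uniformly into brisk and lazy.
    others = active_red - loner_parents
    brisk = {v for v in others if random.random() < 0.5}
    temporary = {}
    for group in (brisk, others - brisk):
        assigned = recruit(group, {u for u in remaining if H[u] & group})
        for u, v in assigned.items():
            if sum(1 for w in assigned if assigned[w] == v) >= 2:
                permanent[u] = v     # parent recruited >= 2 children: permanent
            else:
                temporary[u] = v     # an only child: the assignment is temporary
        remaining -= set(assigned)
    # Stage III: only red nodes with exactly one (temporary) child stay active;
    # all other red nodes are marked, receive their rank, and become inactive.
    return permanent, temporary, set(temporary.values())
```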

Figure 2 shows an example of assignments during an epoch (the first epoch). The green arrows in the leftmost part indicate the loner blue nodes at the start of the epoch. The loner-parent red nodes are indicated by a number 1 next to them, meaning they are active in part 1. Brisk and lazy red nodes are indicated by numbers 2 and 3, respectively. The smaller nodes represent the (temporarily or permanently) deactivated nodes. The green dashed lines show the permanent assignments and the (thicker) orange dashed lines show the temporary assignments. After the end of the epoch, nodes with a temporary assignment are re-activated. The graph remaining after the first epoch is shown on the right side of Fig. 2, by solid blue lines.

Fig. 2 Parts 1, 2, and 3 of stage II of the first epoch of the assignment algorithm, and the graph remaining after the first epoch

Analysis In Lemma 3, we prove that in each of the \(\varTheta (\log n)\) epochs except the first one, the size of the assignment problem for rank \(i\) decreases by at least a constant factor, with at least a constant probability. Here, by the size of the assignment problem, we mean the number of active red nodes with a blue neighbor of rank \(i\). A standard Chernoff bound then shows that, with high probability, after \(\varTheta (\log n)\) epochs each blue node of rank \(i\) has a parent. It is clear that the parents are ranked according to the ranking rules of GSTs and that nodes know their own rank, the id of their parent, and the rank of their parent. We show in Lemma 4 that, with high probability, the assignment is collision-free.

Lemma 3

In each epoch \(j' \ge 2\), with probability at least \(1/7\), the number of remaining active red nodes for the next epoch goes down by a factor of at least \(8/7\).

Proof

Consider epoch \(j' \ge 2\) and let \(\eta \) be the number of active red nodes at the start of this epoch. We show that the expected number of red nodes that remain active at the end of this epoch is at most \(\frac{3\eta }{4}\). This is enough for the proof because with this, and by Markov’s inequality, we get that with probability at least \(1/7\), the number of active remaining red nodes at the end of this epoch is at most \(\frac{7\eta }{8}\).

Each red node remains active after epoch \(j'\) only if it gets a temporary assignment, i.e., if it is not a loner-parent and it recruits exactly one child during parts 2 and 3 of stage II. Thus, the expected number of red nodes that remain active is at most the expected number of brisk red nodes (those that act in part 2) plus the expected number of blue nodes that are active in part 3. The expected number of brisk red nodes is at most \(\frac{\eta }{2}\). To complete the proof, we show that the expected number of blue nodes that remain active for part 3 (after the assignments of part 2) is at most \(\frac{\eta }{4}\).

After each epoch, the only red nodes that remain active are those that have a temporary assignment, i.e., those that each have recruited exactly one child and that child is not a loner. Moreover, the only active remaining blue nodes are those blue nodes temporarily matched to the remaining red nodes. Thus, after each epoch, the number of remaining active red nodes and the number of remaining active blue nodes are equal. From this, we can conclude that since \(j'\ge 2\), at the start of epoch \(j'\), the number of active blue nodes is at most \(\eta \).

Using Lemma 2, we infer that in part 1 of stage II, each blue neighbor of a loner-parent is w.h.p. recruited by a red loner-parent. Thus, in particular, each loner is recruited with high probability. Hence, at the start of part 2 of stage II, each remaining active blue node has at least two active red neighbors. Since each non-loner-parent red node is active in part 2 of stage II with probability \(1/2\), and because in part 2 of stage II each active blue node that has an active red neighbor gets recruited with high probability (by Lemma 2), each blue node remains active after part 2 of stage II with probability at most \(1/4\). By the previous paragraph, the number of remaining active blue nodes at the start of part 2 of stage II is at most \(\eta \). Hence, the expected number of blue nodes remaining active after part 2 is at most \(\frac{\eta }{4}\). This completes the proof of the lemma. \(\square \)

Lemma 4

With high probability, the bipartite assignment algorithm creates a collision-free assignment.

Proof

We show that if there exist blue nodes \(u_1\) and \(u_2\) (\(u_1 \ne u_2\)) and their respective red parents \(v_1\) and \(v_2\) (\(v_1 \ne v_2\)), all four with rank \(i\), then with high probability, \(H\) has no edge between \(u_2\) and \(v_1\) and no edge between \(u_1\) and \(v_2\). For the sake of contradiction, and without loss of generality, suppose that there is an edge between \(u_2\) and \(v_1\). Figure 3 shows the configuration of these four nodes. Since \(v_2\) and \(u_2\) have rank \(i\), blue node \(u_2\) must have been a loner when \(v_2\) recruited it. Thus, \(v_2\) recruited \(u_2\) after \(v_1\) became inactive. Hence, in the epoch in which \(v_1\) recruited \(u_1\), node \(u_2\) was active. Therefore, using Lemma 2, we get that in part 1 of the epoch in which \(v_1\) recruited \(u_1\), node \(u_2\) must have been, w.h.p., recruited by either \(v_1\) or some other loner-parent. Since \(v_2\ne v_1\) recruited \(u_2\), we get that \(v_2\) must have been that other loner-parent. This means that at that time, \(v_2\) had a loner child (\(\ne u_2\)) and thus \(v_2\) had recruited more than one child of rank \(i\). This means that \(v_2\) must have had rank \(i+1\), which contradicts the assumption that \(v_2\) has rank \(i\). \(\square \)

Fig. 3 Collision-freeness proof

2.2.4 Pipelining the GST construction

Note that in the algorithm described in Sect. 2.2.3 where we are working on the assignment problem between levels \(l-1\) and \(l\), once we are done with the assignment problem of ranks \(i\) and \(i-1\), nodes of level \(l-1\) that receive rank \(i\) are already determined, i.e., no other node in level \(l-1\) will receive rank \(i\). Thus, we can solve the two problems of rank \(i-2\) assignment between levels \(l-1\) and \(l\) and rank \(i\) assignments between levels \(l-2\) and \(l-1\), essentially simultaneously, by interleaving them in even and odd rounds. Using the same idea, it is easy to see that one can pipe-line the assignment problems of different ranks between different levels. Then, the assignment problem between levels \(l-1\) and \(l\) starts after \(\varTheta ((D-l) \log ^4 n )\) rounds. Thus, the assignment problem of largest possible rank between levels \(0\) and \(1\) starts after \(\varTheta (D \log ^4 n)\) rounds. The largest rank is at most \(\lceil \log n \rceil \). Since each rank takes \(\varTheta (\log ^4 n)\) rounds, the whole GST construction problem finishes after \(\varTheta (D \log ^4{n})\) rounds.

2.3 Unknown topology single-message broadcast in \(O(D + \log ^6 n)\) rounds

Theorem 1

(Restated) In radio networks with unknown topology and with collision detection, there is a randomized distributed algorithm that broadcasts a single message in \(O(D + \log ^6 n)\) rounds, with high probability.

Proof

We first use a wave of collisions to obtain a BFS layering in \(D\) rounds. That is, the source transmits in all rounds \([1, D]\), and each node \(v\) transmits in all rounds \([r, D]\), where \(r\) is such that \(v\) receives a message or a collision for the first time in round \(r-1\). For each node \(v\), the round \(r-1\) in which \(v\) receives the first message or collision determines the distance of \(v\) from the source.
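
The collision-wave layering can be captured by the following sketch (illustrative names): a node starts transmitting one round after the first round in which it hears anything, so the round of first reception equals its distance from the source.

```python
def collision_wave_levels(graph, source, D):
    """first_heard[v] is the first round in which v hears a message or a
    collision; it equals dist(source, v)."""
    first_heard = {source: 0}            # the source starts transmitting in round 1
    for r in range(1, D + 1):
        transmitting = {v for v, t in first_heard.items() if t < r}
        for v in transmitting:
            for u in graph[v]:
                first_heard.setdefault(u, r)
    return first_heard
```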

Having this BFS layering, we decompose the graph into \(O(\log ^4 n)\) rings, each consisting of \(D' = {D}/{\log ^4 n}\) consecutive layers of the BFS layering.

Then, we compute a gathering spanning tree for each of the rings in \(O(D' \log ^4 n) = O(D)\) rounds. Note that the time for computing a GST of a ring depends only on \(D'\), the number of BFS layers that the ring contains, and that, given the BFS layering, the GSTs of all rings are computed in parallel.

Having these GSTs, broadcasting the message inside each ring takes \(O(D'+\log ^2 n)\) rounds, using [8]. Finally, we use \(O(\log ^2 n)\) rounds of the Decay protocol [3] to propagate the message from the outer boundary of one ring to the inner boundary of the next ring. Since there are \(O(\log ^4 n)\) rings, the whole broadcast takes \(\big (O(D'+\log ^2 n) + O(\log ^2 n)\big ) \cdot O(\log ^4 n) = O(D+ \log ^6 n)\) rounds. \(\square \)

3 Multi-message broadcast

In this section, we show the following two results:

Theorem 2

(Restated) In radio networks with known topology (even without collision detection), there is a randomized distributed algorithm that broadcasts \(k\) messages in \(O(D + k \log n + \log ^2 n)\) rounds, with high probability.

Theorem 3

(Restated) In radio networks with unknown topology and with collision detection, there is a randomized distributed algorithm that broadcasts \(k\) messages in \(O(D + k \log n + \log ^6 n)\) rounds, with high probability.

In Sects. 3.1–3.3.3, we present and analyze the algorithm that achieves Theorem 2. We remark that the \(O(D + k \log n + \log ^2 n)\) round complexity of Theorem 2 is optimal, given the \(\varOmega (k \log n)\) lower bound of [2] for \(k\)-message broadcast, the \(\varOmega (\log ^2 n)\) lower bound of [1] for single-message broadcast, and the trivial \(\varOmega (D)\) lower bound.

Furthermore, it is easy to combine the known topology algorithm of Theorem 2 with the ideas of the proof of Theorem 1 (i.e., breaking the graph into rings of radius \(\lceil \frac{D}{\log ^4 n} \rceil \)) and the standard technique of grouping messages and pipe-lining the groups, to prove Theorem 3. We present the details of this part in Sect. 3.4.

3.1 Challenges in broadcasting multiple messages

Given the known transmission schedules for broadcasting a single message in optimal \(O(D+\log ^2 n)\) time on top of a GST, it is tempting to try to use the same transmission schedule to solve the multi-message broadcast problem. However, since the spreading processes of different messages cannot be decoupled, this approach faces two challenges:

Firstly, when a node \(v\) has already learned multiple messages and is triggered by the schedule to transmit, \(v\) needs to decide which message to forward. Choosing one message over the others can slow down the progress of those other messages. Fortunately, random linear network coding (RLNC) [14] provides a general technique for making such decisions: instead of deciding on one specific message whenever \(v\) is triggered to send, \(v\) transmits a random linear combination of all packets it has received. It has been shown that this is a universally optimal strategy, that is, it succeeds with high probability as soon as it was possible (in hindsight) to send \(k\) messages to each of the receivers [13]. There are furthermore indications that network coding might be necessary for obtaining an asymptotically optimal throughput performance [2]. Our multi-message broadcast utilizes RLNC and uses recent advances in analyzing RLNC performance [12] for the proofs. Even though RLNC and its analysis need to be carefully tailored to the radio broadcast setting here, this already gives us a good plan to remedy the first issue.

The second issue is subtle but turns out to be more problematic: When proving progress of messages, all known single-message schedules and their analyses (e.g., those of [8]) rely crucially on the fact that the nodes that do not have the (single) message remain silent and cause no collisions. In a multi-message setting it becomes a necessity that we make progress for a message while allowing other nodes that do not have this message to transmit (in order to make progress on other messages).

Trying to understand and resolve this problem prompted us to define the property of a transmission schedule being multi-message viable (MMV):

Definition 1

We say that a transmission schedule broadcasts one message in a multi-message viable (MMV) way in \(T\) rounds with probability \(1-\delta \) if the following holds: Suppose that we use this transmission schedule, except that nodes that are scheduled to transmit but do not have the message send “noise”. Then, the message is broadcast to all nodes in \(T\) rounds with probability \(1-\delta \).

Intuitively, this notion captures the viewpoint where we focus on one message and the transmissions of the other messages are regarded as noise, possibly harming the progress of the message in consideration. We later see that this notion is enough to prove that a schedule works well with RLNC.

Unfortunately, proving that a schedule is MMV is not straightforward, and it is a priori not clear whether already existing schedules are MMV. The easiest example to see this is the well-known Decay protocol of [3]: in the classical implementation of the Decay protocol, if a node is scheduled to transmit but does not have the message, then this node remains silent. The Decay protocol broadcasts a single message in \(O(D\log n+\log ^2 n)\) rounds, with high probability [3]. This follows almost directly from a simple progress lemma which shows that in \(O(\log n)\) rounds of the protocol, a node receives the message with constant probability if at least one of its neighbors already has the message. However, if the nodes that do not have the message are allowed to send noise when the schedule prompts them to transmit, then this key progress lemma of [3] does not hold anymore. Surprisingly, even though the progress lemma breaks, it is still true that the message spreads quickly in this case (when nodes that do not have the message send noise), meaning that the Decay protocol broadcasts in \(O(D\log n+\log ^2 n)\) rounds, w.h.p., in an MMV way:

Before formally proving this fact, let us first recall the transmission schedule of the Decay protocol (cf. Sect. 2.2.1): rounds are divided into phases of \(\log n\) rounds, and in the \(i\)th round of each phase, each participating node transmits with probability \(2^{-i}\); in the MMV setting, a node that is scheduled to transmit but does not have the broadcast message transmits “noise”.

Lemma 5

The Decay protocol broadcasts one message in an MMV way in \(O(D\log n+\log ^2 n)\) rounds, w.h.p.

To prove this lemma, we need to depart from the analysis approach of [3], which chooses a shortest path from the source \(s\) to node \(v\) and shows that the broadcast message makes fast progress along this path when moving forward in time. Instead, we use what we call backwards analysis: in a nutshell, we move backwards in time and find a sequence of collision-free transmissions from \(s\) to \(v\), where the hops of this sequence are unraveled backwards (from \(v\) to \(s\)). While unraveling this sequence, each of these transmissions can be the broadcast message or just “noise”, depending on whether the sender has received the broadcast message or not. Once we reach \(s\), it means that the transmissions in the sequence indeed were the broadcast message.

Proof of Lemma 5

Fix an arbitrary node \(v\). Let \(T=\lambda (D\log n+\log ^2 n)\) for a large enough constant \(\lambda \). For each integer \(t\), we say that node \(u\) is transmission-connected to \(v\) by backwards time \(t\) if there is a timely sequence of transmissions \(u=w_1, w_2, \dots , w_\ell =v\) where for each \(i\in [1, \ell -1]\), \(w_i\) transmits in a round \(r_i \in [T-t, T]\), we have \(r_{i} < r_{i+1}\), and in round \(r_i\), in which \(w_i\) transmits, \(w_{i+1}\) receives a message from \(w_i\). We emphasize that these transmissions do not distinguish whether the transmitted message is just “noise” or the actual message of the broadcast problem. If node \(w_i\) has received the broadcast message by the end of round \(r_i-1\), then the transmission of \(w_i\) in round \(r_i\) is the actual broadcast message; otherwise, it is noise. Let \(S_t(v)\), or simply \(S_t\), be the set of all nodes that are transmission-connected to \(v\) by backwards time \(t\). For each backwards time \(t\), define the potential \(\varPhi (t) = \min _{u\in S_t} dist_{G}(s, u)\). We claim that for any two backwards times \(t\) and \(t'>t\) such that \(t'-t = 3\lceil \log n \rceil \), if \(\varPhi (t)\ge 1\), then with probability at least \(1/(2e)\) we have \(\varPhi (t') \le \varPhi (t)-1\). A Chernoff bound then shows that with high probability \(\varPhi (T)=0\), meaning \(s \in S_{T}\). This shows that, with high probability, there exists a sequence of collision-free transmissions (and message receptions) which starts at source \(s\) and ends at node \(v\) by time \(T\), proving that, with high probability, \(v\) receives the message of \(s\) by time \(T\).

To prove the claim, consider two backwards times \(t\) and \(t'>t\) such that \(t'-t = 3\lceil \log n \rceil \) and \(\varPhi (t)\ge 1\). Let \(u^*\) be a node \(u\) in \(S_t\) that minimizes \(dist_{G}(s, u)\). We show that in the round interval \([T-t', T-t]\), with probability at least \(1/8\), \(u^*\) receives at least one message (be it noise or the actual broadcast message) from a neighbor \(u'\) such that \(dist(s,u') = dist(s,u^*)-1\). Let \(k\) be the number of neighbors \(u'\) of \(u^*\) such that \(dist(s,u') = dist(s,u^*)-1\). Consider the round \(r^* \in [T-t', T-t]\) such that \((r^*-dist(s,u^*))/3 \equiv \lceil \log _2 k \rceil \pmod {\lceil \log n \rceil }\). In that round, the only neighbors of \(u^*\) that can transmit are those neighbors \(u'\) that have \(dist(s,u') = dist(s,u^*)-1\), and each of them transmits with probability \(2^{-\lceil \log _2 k \rceil }\). The probability that \(u^*\) receives a message from one of them is at least \(\frac{k}{2^{\lceil \log _2 k \rceil }} \big (1-\frac{1}{2^{\lceil \log _2 k \rceil }}\big )^{k-1} \ge \frac{1}{8}\). This proves the claim.

A union bound over all nodes \(v\) shows that, with high probability, all nodes receive the message by round \(O(D\log n+\log ^2 n)\). \(\square \)

Unfortunately, in contrast to the transmission schedule of the Decay protocol, the GST-based schedule of [8] does not appear to be MMV. In Sect. 3.2, we present a new transmission schedule for GSTs and again use our backwards analysis to show that this schedule is MMV. Lastly, we show that if one combines RLNC with this new schedule, then the MMV property almost directly translates into a high broadcast throughput, leading to the optimal broadcast time of \(O(D + k\log n + \log ^2 n)\) rounds for \(k\) messages.

3.2 A multi-message transmission schedule atop GST

In this section, we present our transmission schedule for GSTs and show that it is MMV. Later we use this schedule along with random linear network coding to achieve our optimal multi-message algorithm.

3.2.1 The schedule

Suppose we have a GST \(T\) for graph \(G\). For each node \(u\), let \(l_u\) be the distance of \(u\) from source \(s\) in graph \(G\) (that is, the BFS level of \(u\)). Also, let \(r_u\) be the rank of \(u\) in GST \(T\). We first construct a virtual directed graph \(G'\), from graph \(G\), as follows: we add a directed edge from every node \(u\) with rank \(r\) that is the first node of a fast stretch to every descendant of \(u\) in \(T\) that has rank \(r\) (thus, to all nodes in that fast stretch). We call this a fast edge. We use the notation \(d_u\) to denote the length of the shortest (directed) path from \(s\) to \(u\) in \(G'\), and we call this virtual-distance. Given graph \(G\), GST \(T\), and the respective virtual graph \(G'\) (and the related virtual-distances), our schedule is defined as follows:

Informally, each node scheduled to transmit performs either a fast transmission (case (a)), which forwards packets along the fast stretch the node lies on, or a slow transmission (case (b)), a Decay-style transmission whose timing and probability are governed by the node's virtual-distance \(d_u\) in \(G'\).

Note that case (a) happens only in even rounds and case (b) happens only in odd rounds. As in [8], we call the transmissions triggered by case (a) fast transmissions and the transmissions triggered by case (b) slow transmissions.

We remark that this schedule uses fast transmissions exactly as in [8, 19] to pipeline the messages along the fast stretches of GST. We see in Lemma 8 that these fast transmissions are collision-free. The crucial difference with the schedule in [8, 19] lies in defining the slow transmissions with respect to the virtual-distance in graph \(G'\) (instead of levels in \(G\)). This change results in slow transmissions not trying to push messages away from the source, but instead trying to push messages towards entry points of fast stretches (even if this leads to the message going back towards the source). While this modification seems minor, it is crucial for allowing the backwards analysis technique to show that the new schedule is efficient and MMV.
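
To make the virtual graph concrete, the following sketch (our own representation) builds the fast edges of \(G'\) from a GST given by parent pointers and ranks, and computes the virtual-distances \(d_u\) by a BFS that may use both the undirected edges of \(G\) and the directed fast edges.

```python
from collections import deque

def virtual_distances(G, T_parent, rank, source):
    """Return d_u for every node u: the length of a shortest directed path from
    the source to u in G' (= G plus the fast edges)."""
    children = {}
    for v, p in T_parent.items():
        children.setdefault(p, []).append(v)
    # u is the first node of a fast stretch if it is the root or its parent has
    # a different rank; its fast edges go to all of its same-rank descendants.
    fast = {u: [] for u in G}
    for u in G:
        if u == source or rank[T_parent[u]] != rank[u]:
            stack = list(children.get(u, []))
            while stack:
                w = stack.pop()
                if rank[w] == rank[u]:
                    fast[u].append(w)
                    stack.extend(children.get(w, []))
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in list(G[v]) + fast[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist
```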

3.2.2 The analysis

The rest of this section is dedicated to proving that the newly defined schedule is MMV:

Lemma 6

The MMV-GST schedule of Sect. 3.2.1 broadcasts one message in an MMV way, in \(O(D + \log n \cdot (\log n+ \log \frac{1}{\delta }))\) rounds, with probability \(1-\delta \).

Before diving directly into the proof of Lemma 6 we show a few helpful invariants.

Lemma 7

In virtual graph \(G'\), for each node \(u\), we have \(d_u \le 2\lceil \log _2 n \rceil \).

Proof

Consider the path from \(u\) to \(s\) in \(T\). On this path, the rank never decreases, and thus it increases at most \(\lceil \log _2 n \rceil \) times. Furthermore, every stretch on which the rank stays the same corresponds to a directed link in \(G'\). Using this, we obtain a path of length at most \(2\lceil \log _2 n \rceil \) from \(s\) to \(u\) in \(G'\). \(\square \)

Lemma 8

There are no collisions between any two fast transmissions.

Proof

Since fast and slow transmissions happen during even and odd rounds, respectively, collisions can only happen between two slow or two fast transmissions. To see that two fast transmissions do not collide, we note that in round \(t\), only nodes with a level \(l \equiv t/2 \pmod {3}\) perform fast transmissions. This is because a fast transmission in round \(t\) happens only if \(t \equiv 2l + 6r \equiv 2l \pmod {6}\). Since nodes whose levels differ by at least \(3\) cannot share a neighbor, we get that collisions can only be caused by transmissions of nodes within the same level. Furthermore, two nodes within the same level perform a fast transmission in the same round only if their ranks \(r\) and \(r'\) are equivalent modulo \(\lceil \log _2 n \rceil \). Since ranks are between \(1\) and \(\lceil \log _2 n \rceil \), this implies that their ranks are equal, and the collision-freeness property of the GST then guarantees that two such nodes do not share a neighbor in the next level. This shows that there are indeed no collisions between any two fast transmissions. \(\square \)

Proposition 1

If node \(u\) with level \(l\) is the beginning of a fast stretch in GST \(\mathcal {T}\) and \(u\) sends a message at time \(t\) in a fast transmission round, then any node \(v\) with level \(l'>l\) on the same fast stretch receives this message by time \(t' = t + 2(l'-l)\).

Lemma 9

For any node \(u\) with virtual-distance \(d_u\), if there is at least one node \(v\) connected to \(u\) in \(G\) with virtual-distance \(d_v=d_u-1\), then during each interval of \(6\lceil \log _2 n \rceil \) rounds, with probability at least \(\frac{1}{8}\), node \(u\) receives a message from one node with virtual-distance \(d_u-1\).

Proof

Let \(x\) be the number of neighbors of \(u\) with virtual-distance \(d_u-1\). Note that within any span of \(6 \lceil \log _2 n \rceil \) rounds there is a round in which all nodes with virtual-distance \(d_u-1\) send a message independently with a probability \(p\) satisfying \(\frac{1}{2x} \le p \le \frac{1}{x}\), while all nodes with virtual-distance \(d_u\) and \(d_u+1\) (and thus also all other neighbors of \(u\)) are silent. The probability that \(u\) receives a message from any particular such neighbor in this round is at least \(\frac{1}{2x} (1 - \frac{1}{x})^{x-1} > \frac{1}{8x}\). These events are mutually exclusive. Hence, the total probability that at least one neighbor successfully transmits to \(u\) during this round is at least \(\frac{1}{8}\). \(\square \)

Proof of Lemma 6

For a large enough constant \(\lambda \) let \(T=\lambda (D + 2\lceil \log _2 n \rceil (\log n + \log \frac{1}{\delta }))\). We claim that for any node \(v\), the probability that node \(v\) does not receive the message in \(T\) rounds is at most \(\delta \).

Fix an arbitrary node \(v\). To prove the claim, we use backwards analysis to view the process of dissemination of the message. In this method, we go back in time, from round \(T\) to round \(1\), and we find a sequence of collision-free transmissions from source node \(s\) to node \(v\). Since we are moving back in time, we find this sequence starting from \(v\) and going backwards till reaching \(s\).

For each \(t\), we say that node \(u\) is transmission-connected to \(v\) by backwards time \(t\) if there is a sequence of transmissions \(u=w_1, w_2, \dots , w_\ell =v\) where for each \(i\in [1, \ell -1]\), \(w_i\) transmits in a round \(r_i \in [T-t, T]\), we have \(r_{i} < r_{i+1}\), and in round \(r_i\), node \(w_{i+1}\) receives a message from \(w_i\). Let \(S_t\) be the set of all nodes that are transmission-connected to \(v\) by backwards time \(t\). Moreover, we define the potential of \(v\) at backwards time \(t\) to be \(\varPhi (t)=\min _{u \in S_t} \big (d_u \lceil \log _2 n \rceil + l_u\big )\). Note that \(\varPhi (0) \le 2\lceil \log _2 n \rceil ^2+ D\). This is because the level of \(v\) in \(G\) is at most \(D\), and the virtual-distance \(d_v\) is at most \(2 \lceil \log _2 n \rceil \). To prove the claim, we show that with probability at least \(1-2^{-(\log \frac{1}{\delta } + 2 \log n)}\), we have \(\varPhi (T)=0\). For this, moving backwards in time, we show that in every interval of \(8\lceil \log _2 n \rceil \) consecutive rounds, this potential decreases by at least \(\lceil \log _2 n \rceil -1\) with probability at least \(\frac{1}{16}\). For a backwards time \(t\), let \(u\) be the node in \(S_t\) that attains the minimum in the definition of \(\varPhi (t)\). The proof is now divided into two cases as follows:

Case (A) Suppose \(u\) has at least one \(G\)-neighbor that has a lower virtual-distance. In this case, Lemma 9 guarantees that with probability at least \(\frac{1}{8}\) during the rounds in \([T-t - 6 \lceil \log _2 n \rceil , T-t]\), there is a collision-free transmission from a node \(u'\) with \(d_{u'} = d_u - 1\) to \(u\). Since \(u'\) and \(u\) are neighbors their levels \(l_u\) and \(l_{u'}\) differ at most by one, thus a successful transmission decreases the potential by at least \((d_u \lceil \log _2 n \rceil + l_u) - (d_{u'} \lceil \log _2 n \rceil + l_{u'}) = (d_u - d_{u'})\lceil \log _2 n \rceil - (l_u - l_{u'}) \ge \lceil \log _2 n \rceil - 1\). Thus, if \(u\) has a neighbor with a virtual-distance lower than \(d_u\) then with probability at least \(\frac{1}{16}\) the potential decreases by at least \(\lceil \log _2 n \rceil - 1\) within any \(8 \lceil \log _2 n \rceil \) rounds when moving backwards in time.

Case (B) Suppose \(u\) does not have a \(G\)-neighbor with a lower virtual-distance. Note that this can only happen if \(u=s\) or if there is a directed edge in \(G'\) representing a fast stretch, originating from a node \(u'\) with \(d_{u'} = d_u - 1\) and going into \(u\). First observe that the starting node of any fast stretch initiates a “transmission wave” every \(6 \lceil \log _2 n \rceil \) rounds by creating a new coded packet and sending it as a fast transmission. This packet then gets pipe-lined through the fast stretch, with one hop of progress in every fast transmission round (that is, once every two rounds), until it reaches the end of the stretch. Thus, for any node on a fast stretch, a new wave arrives every \(6\lceil \log _2 n \rceil \) rounds. In particular, in some round \(T-t'\) with \(t' \in [t, t+6 \lceil \log _2 n \rceil ]\), a fast transmission wave arrives at \(u\) and leads to an extended sequence of collision-free transmissions. If the wave originated from \(u'\) during the rounds \([T-t'-2 \lceil \log _2 n \rceil , T-t']\), then there is a sequence of transmissions from \(u'\) to \(v\) in the round interval \([T-t - 8 \lceil \log _2 n \rceil , T-t]\); otherwise, the wave propagated for at least \(\lceil \log _2 n \rceil \) steps and there is a node \(u''\) between \(u'\) and \(u\) on the fast stretch with a sequence of transmissions to \(v\) starting at time \(T-t - 8 \lceil \log _2 n \rceil \). Thus, in both cases, the potential drops by at least \(\lceil \log _2 n \rceil -1\). In the first case the potential drop comes from the fact that \(d_{u'} = d_u - 1\) and \(l_{u'} < l_u\), while in the second case we have \(d_{u''} \le d_{u'} + 1 = d_u\) and \(l_{u''} \le l_{u} - \lceil \log _2 n \rceil \).

The above argument shows that, when moving backwards in time, in every \(8 \lceil \log _2 n \rceil \) consecutive rounds, with probability at least \(\frac{1}{16}\), the potential of \(v\) decreases by at least \(\lceil \log _2 n \rceil -1 > \lceil \log _2 n \rceil /2\), until it reaches zero. When the potential reaches zero, there is a sequence of successful, collision-free transmissions from \(s\) to \(v\).

Hence, the expected time for such a sequence to appear is a constant times the initial potential of \(v\), \(\varPhi (0) \le 2\lceil \log _2 n \rceil ^2 +D\). A Chernoff bound furthermore shows that the probability of not finding such a sequence is exponentially concentrated around this mean. In particular, after \(T=\lambda (D + 2\lceil \log _2 n \rceil (\log n + \log \frac{1}{\delta }))\) rounds, we expect at least \(\lambda '(2D/\lceil \log _2 n \rceil + 4\lceil \log _2 n \rceil + 2\log {\frac{1}{\delta }})\) sets of \(8 \lceil \log _2 n \rceil \) consecutive rounds in which the potential of \(v\) drops by at least \(\lceil \log _2 n \rceil /2\), for a constant \(\lambda '\). Furthermore, the probability that there are fewer than \(2D/\lceil \log _2 n \rceil + 4\lceil \log _2 n \rceil \) such sets is exponentially small in the expectation, that is, at most \(2^{-(2\lceil \log _2 n \rceil + \log {\frac{1}{\delta }})} < \delta /n\). A union bound over all choices of node \(v\) then completes the proof. \(\square \)

3.3 Optimal multi-message broadcast algorithms

We achieve our optimal multi-message broadcast algorithms by combining random linear network coding with the multi-message GST schedule that we presented in Sect. 3.2. In Sect. 3.3.1 we first recall the exact workings of random linear network coding, and in Sect. 3.3.2 we explain how to integrate it with our MMV GST schedule. In Sect. 3.3.3 we combine the analysis technique from [12] with the proof that our schedule is MMV to obtain Theorem 2, i.e., our multi-message result for the known topology setting. In Sect. 3.4 we then discuss how this algorithm can be extended to the unknown topology setting to obtain Theorem 3.

3.3.1 Random linear network coding

In random linear network coding [14] the \(k\) messages are regarded as bit-vectors \(\mathbf{m}_\mathbf{1}, \ldots , \mathbf{m}_\mathbf{k} \in \mathbb {F}_{2}^l\) over \(\mathbb {F}_{2}\), the finite field of order two. Instead of putting one message in plaintext into a packet, nodes transmit coded packets. Each network-coded packet \(p\) consists of a linear combination of messages, that is, the vector \(\sum _{i=1}^k \alpha _i \mathbf{m}_\mathbf{i} \in \mathbb {F}_{2}^l\). One should think of the coefficient vector \(\varvec{\alpha } = (\alpha _1, \ldots , \alpha _k) \in \mathbb {F}_{2}^k\) as being transmitted along with each coded packet.Footnote 5
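As a purely illustrative aside (not part of the algorithm's specification), the following minimal Python sketch builds one such coded packet over \(\mathbb {F}_{2}\); messages and coefficient vectors are represented as Python integers used as bit-vectors, and all names are hypothetical.

```python
import random

def make_coded_packet(messages, k):
    """Return (coefficient_vector, payload) for one random linear combination over GF(2).

    messages: list of k integers, each encoding an l-bit message as a bit-vector.
    The coefficient vector is a uniformly random element of GF(2)^k, packed into an int.
    """
    coeffs = random.getrandbits(k)           # random alpha in GF(2)^k
    payload = 0
    for i in range(k):
        if (coeffs >> i) & 1:                # alpha_i = 1: include message i
            payload ^= messages[i]           # addition over GF(2) is XOR
    return coeffs, payload

# Example: three 8-bit messages combined into one coded packet.
msgs = [0b10110001, 0b01100110, 0b11111000]
alpha, p = make_coded_packet(msgs, k=3)
```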

Because of linearity, a node that holds a number of these packets can create a packet of this form for any coefficient vector in the span of the coefficient vectors of the packets it has received so far. Moreover, if a node has a set of \(k\) packets with linearly independent coefficient vectors, then it can reconstruct all \(k\) messages using Gaussian elimination. In RLNC, every node \(u\) stores all its received packets in order to maintain the subspace spanned by them. Whenever \(u\) decides to generate a new coded packet, it chooses a random coefficient vector from this subspace by taking a random linear combination of the stored packets. Once the subspace spanned by the coefficient vectors of the packets received by \(u\) is the full space \(\mathbb {F}_{2}^k\), node \(u\) decodes and reconstructs all the messages.
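To make the decoding step concrete, here is a minimal Gaussian-elimination sketch over \(\mathbb {F}_{2}\), again with bit-vectors packed into integers; this is only an illustration of the standard procedure under the representation above, not the paper's implementation.

```python
def decode(packets, k):
    """Recover the k messages from coded packets, if the coefficient vectors span GF(2)^k.

    packets: list of (coeffs, payload) pairs; coeffs is a k-bit int whose bit i
    indicates whether message i is included, payload is the XOR of those messages.
    Returns the list of k messages, or None if the span is not yet full.
    """
    pivot = [None] * k                        # pivot[i]: a reduced row with leading bit i
    for coeffs, payload in packets:
        for i in range(k - 1, -1, -1):        # forward elimination, high bits first
            if not (coeffs >> i) & 1:
                continue
            if pivot[i] is None:
                pivot[i] = (coeffs, payload)
                break
            pc, pp = pivot[i]                 # eliminate bit i (XOR = subtraction in GF(2))
            coeffs ^= pc
            payload ^= pp
    if any(p is None for p in pivot):
        return None
    messages = [0] * k
    for i in range(k):                        # back substitution, low bits first
        coeffs, payload = pivot[i]
        for j in range(i):
            if (coeffs >> j) & 1:
                coeffs ^= 1 << j
                payload ^= messages[j]
        messages[i] = payload
    return messages

# Example with k = 2: packets encode m1 ^ m2 and m2, so decode returns [m1, m2].
assert decode([(0b11, 0b1010 ^ 0b0110), (0b10, 0b0110)], k=2) == [0b1010, 0b0110]
```

Node-local decoding cost does not enter the round complexity, which only counts communication rounds.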

3.3.2 Combining the MMV GST schedule with random linear network coding

It is now easy to combine random linear network coding with our new GST Schedule:

[Figure e: the transmission schedule obtained by combining the MMV GST Schedule of Sect. 3.2 with random linear network coding.]
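Since the algorithm figure is only referenced here, the following hedged Python sketch indicates one way the combination can be read from the text: whenever the GST schedule instructs a node to transmit (a slow or a fast transmission), the node sends a fresh random linear combination of all packets in its buffer, and every successfully received packet is simply added to the buffer. All class, field, and function names are hypothetical illustrations.

```python
import random

class RLNCNode:
    """One node's behaviour when the MMV GST Schedule is run with RLNC-coded packets
    (a sketch under the assumptions stated above, not the paper's exact pseudocode)."""

    def __init__(self, k, is_source=False, messages=None):
        self.k = k
        # Buffer of (coeffs, payload) pairs. The source starts with the k plaintext
        # messages, i.e., the k unit coefficient vectors.
        self.buffer = ([(1 << i, m) for i, m in enumerate(messages)]
                       if is_source and messages is not None else [])

    def fresh_packet(self):
        """A uniformly random linear combination (over GF(2)) of the buffered packets."""
        coeffs, payload = 0, 0
        for c, p in self.buffer:
            if random.getrandbits(1):
                coeffs ^= c
                payload ^= p
        return coeffs, payload

    def step(self, scheduled_to_transmit, received=None):
        """One round: transmit a fresh coded packet if the GST schedule says so;
        otherwise store the (collision-free) packet received in this round, if any."""
        if scheduled_to_transmit and self.buffer:
            return self.fresh_packet()
        if received is not None:
            self.buffer.append(received)
        return None
```

Decoding then proceeds as in the previous sketch, once the coefficient vectors in the buffer span \(\mathbb {F}_{2}^k\).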

3.3.3 Analyzing the multi-message broadcast algorithm

In this section we prove Theorem 2 by analyzing the performance of the multi-message broadcast algorithm presented in Sect. 3.3. The analysis combines the proof for the MMV property of the new GST Schedule with the projection analysis from [12].

The following definition and proposition are taken from [12] and form a simple and clean platform for analyzing random linear network coding:

Definition 2

[12, Definition 4.1] A node \(v\) is infected by a coefficient vector \(\varvec{\mu } \in \mathbb {F}_{2}^k\) if \(v\) has received a packet with a coefficient vector \(\mathbf{c} \in \mathbb {F}_{2}^k\) that is not orthogonal to \(\varvec{\mu }\), that is, \(\left\langle \varvec{\mu },\mathbf{c} \right\rangle \ne 0\).
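Over \(\mathbb {F}_{2}\), the inner product \(\left\langle \varvec{\mu },\mathbf{c} \right\rangle \) is just the parity of the bitwise AND of the two bit-vectors, so with the integer representation used in the sketches above the infection test is a one-liner:

```python
def infects(mu, c):
    """True iff <mu, c> != 0 over GF(2), i.e., a packet with coefficient vector c
    infects a node with respect to mu (Definition 2)."""
    return bin(mu & c).count("1") % 2 == 1
```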

Proposition 2

[12, Lemma 4.2] If a node \(v\) is infected by a coefficient vector \(\varvec{\mu }\) and after that, a node \(u\) receives a packet from node \(v\), then \(u\) gets infected by \(\varvec{\mu }\) with probability at least \(1/2\). Furthermore, if a node \(v\) is infected by all the \(2^k\) coefficient vectors in \(\mathbb {F}_{2}^k\), then \(v\) can decode all the \(k\) messages.

With these tools we can proceed to prove Theorem 2:

Proof of Theorem 2

For a large enough constant \(\lambda \) let \(T=\lambda (D + k \lceil \log _2 n \rceil + 2\lceil \log _2 n \rceil ^2)\). We claim that for any node \(v\) and any fixed non-zero vector \(\varvec{\mu } \in \mathbb {F}_{2}^k\), the probability that node \(v\) is not infected by \(\varvec{\mu }\) within \(T\) rounds is at most \(2^{-(k + 2 \log n)}\). The proof of this claim is almost identical to the proof of Lemma 6, except that we want a failure probability \(\delta =O(2^{-k})\) and we must also consider whether each transmission is successful with respect to \(\varvec{\mu }\) or not. For completeness, we repeat the proof with all details, starting with the next paragraph. Once the claim is proven, a union bound over all the \(2^k\) coefficient vectors in \(\mathbb {F}_{2}^k\) shows that by round \(T\), with high probability, \(v\) is infected by all the coefficient vectors in \(\mathbb {F}_{2}^k\); that is, by round \(T\), node \(v\) can decode all the \(k\) messages. Using another union bound over all choices of node \(v\), we get that, with high probability, all nodes have received all the messages by round \(T\).
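Spelling out the two union bounds behind this paragraph (with \(\log \) denoting \(\log _2\), as in the claimed failure probability): for a fixed node \(v\), and then over all \(n\) nodes,
\[
2^{k} \cdot 2^{-(k + 2 \log n)} = \frac{1}{n^{2}},
\qquad
n \cdot \frac{1}{n^{2}} = \frac{1}{n},
\]
so all nodes can decode all \(k\) messages by round \(T\) with probability at least \(1-\frac{1}{n}\).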

Fix a node \(v\) and a non-zero vector \(\varvec{\mu } \in \mathbb {F}_{2}^k\). To prove the claim, we use backwards analysis to track how the infection by vector \(\varvec{\mu }\) spreads. In this method, we go back in time, from round \(T\) to round \(1\), and we find a sequence of collision-free transmissions from the source node \(s\) to node \(v\) such that all the transmissions in this chain are successful with respect to vector \(\varvec{\mu }\). Since we are moving back in time, we find this sequence starting from \(v\) and going backwards until reaching \(s\).

For each \(t\), we say that node \(u\) is “transmission-connected to \(v\) by backwards time \(t\)” if there is a sequence of transmissions \(u=w_1, w_2, \dots , w_\ell =v\) where for each \(i\in [1, \ell -1]\), \(w_i\) transmits in a round \(r_i \in [T-t, T]\), we have \(r_{i} < r_{i+1}\), and in round \(r_i\), \(w_{i+1}\) receives a message from \(w_i\). Let \(S_t\) be the set of all nodes that are transmission-connected to \(v\) by backwards time \(t\). We then define the potential of \(v\) with respect to vector \(\varvec{\mu }\) at backwards time \(t\) to be \(\varPhi _{\varvec{\mu }}(t)=\min _{u \in S_t} \left( d_u \lceil \log _2 n \rceil + l_u\right) \). Note that \(\varPhi _{\varvec{\mu }}(0) \le 2\lceil \log _2 n \rceil ^2+ D\). This is because the level of \(v\) in \(G\) is at most \(D\) and every virtual-distance \(d_u\) is at most \(2 \lceil \log _2 n \rceil \). To prove the claim, we show that with probability at least \(1-2^{-(k + 2 \log n)}\), we have \(\varPhi _{\varvec{\mu }}(T)=0\). For this, moving backwards in time, we show that in every interval of \(8\lceil \log _2 n \rceil \) consecutive rounds, this potential decreases by at least \(\lceil \log _2 n \rceil -1\) with probability at least \(\frac{1}{16}\). For a backwards time \(t\), let \(u\) be the node in \(S_t\) that minimizes the potential of \(v\). The proof is now divided into two cases as follows:

Case (A) Suppose \(u\) has at least one \(G\)-neighbor with a lower virtual-distance. In this case, Lemma 9 guarantees that, with probability at least \(\frac{1}{8}\), during the rounds in \([T-t - 6 \lceil \log _2 n \rceil , T-t]\) there is a collision-free transmission from a node \(u'\) with \(d_{u'} = d_u - 1\) to \(u\), and this transmission is successful with respect to \(\varvec{\mu }\) with probability at least \(1/2\). Since \(u'\) and \(u\) are neighbors, their levels \(l_u\) and \(l_{u'}\) differ by at most one; thus, such a successful transmission decreases the potential by at least \((d_u \lceil \log _2 n \rceil + l_u) - (d_{u'} \lceil \log _2 n \rceil + l_{u'}) = (d_u - d_{u'})\lceil \log _2 n \rceil + (l_u - l_{u'}) \ge \lceil \log _2 n \rceil - 1\). Thus, if \(u\) has a neighbor with a virtual-distance lower than \(d_u\), then with probability at least \(\frac{1}{16}\), the potential decreases by at least \(\lceil \log _2 n \rceil - 1\) within any \(8 \lceil \log _2 n \rceil \) rounds when moving backwards in time.

Case (B) Suppose \(u\) does not have a \(G\)-neighbor with a lower virtual-distance. Note that this can only happen if \(u=s\) or if there is a directed edge in \(G'\) representing a fast stretch, originating from a node \(u'\) one level below \(u\) in \(G'\) and going into \(u\). First observe that the starting node of any fast stretch initiates a “transmission wave” every \(6 \lceil \log _2 n \rceil \) rounds by creating a new coded packet and sending it as a fast transmission. This packet is then pipelined through the fast stretch, making one hop of progress in every fast-transmission round (that is, once in every two rounds), until it reaches the end of the stretch. Thus, for any node on a fast stretch, a new wave arrives every \(6\lceil \log _2 n \rceil \) rounds. Moreover, each of these waves is successful with respect to \(\varvec{\mu }\) with probability at least \(1/2\). Thus, at some time \(t' \in [T-t-6 \lceil \log _2 n \rceil , T-t]\), a fast transmission wave arrives at \(u\), and with probability at least \(1/2\) it leads to an extended sequence of collision-free transmissions that are successful with respect to \(\varvec{\mu }\). In particular, if the wave originated from \(u'\) during the rounds \([T-t'-2 \lceil \log _2 n \rceil , T-t']\), then there is a sequence of transmissions from \(u'\) to \(v\) in the round interval \([T-t - 8 \lceil \log _2 n \rceil , T-t]\); otherwise, the wave has propagated for at least \(\lceil \log _2 n \rceil \) steps and there is a node \(u''\) between \(u'\) and \(u\) on the fast stretch with a sequence of transmissions to \(v\) starting at time \(T-t - 8 \lceil \log _2 n \rceil \). Thus, in both cases, the potential drops by at least \(\lceil \log _2 n \rceil -1\): in the first case because \(d_{u'} = d_u - 1\) and \(l_{u'} < l_u\), and in the second case because \(d_{u''} \le d_{u'} + 1 = d_u\) and \(l_{u''} \le l_{u} - \lceil \log _2 n \rceil \).

The above argument shows that when moving backwards in time, in every \(8 \lceil \log _2 n \rceil \) consecutive rounds, with probability at least \(\frac{1}{16}\), the potential of \(v\) decreases by at least \(\lceil \log _2 n \rceil -1 > \lceil \log _2 n \rceil /2\), until reaching zero. When the potential reaches zero, there is a sequence of collision-free transmissions from \(s\) to \(v\) that are all successful with respect to \(\varvec{\mu }\). Hence, the expected time for such a sequence to appear is a constant times the initial potential of \(v\), \(\varPhi _{\varvec{\mu }}(0) \le 2\lceil \log _2 n \rceil ^2 +D\). A Chernoff bound furthermore shows that the time until such a sequence appears is sharply concentrated around this mean. In particular, after \(T=\lambda (D + k \lceil \log _2 n \rceil + 2\lceil \log _2 n \rceil ^2)\) rounds, we expect at least \(\lambda '(2D/\lceil \log _2 n \rceil + 4\lceil \log _2 n \rceil + k)\) sets of \(8 \lceil \log _2 n \rceil \) consecutive rounds in which the potential of \(v\) drops by at least \(\lceil \log _2 n \rceil /2\), for a constant \(\lambda '\). Furthermore, the probability that there are fewer than \(2D/\lceil \log _2 n \rceil + 4\lceil \log _2 n \rceil \) such sets is exponentially small in this expectation, that is, at most \(2^{-(2\lceil \log _2 n \rceil + k)} \le 2^{-(k + 2\log n)}\), as claimed. This completes the proof of Theorem 2. \(\square \)

3.4 Extending the multi-message broadcast to the unknown topology setting

To achieve Theorem 3, the key idea is to combine the multi-message broadcast algorithm for known topology presented in Sects. 3.2 and 3.3 with the idea presented in Sect. 2.3, that is, decomposing the graph into rings of width \(D'=\frac{D}{\log ^4 n}\) layers around the source node using collision detection, and then creating one GST for each ring. Here, we present the remaining details needed to fill out this outline and obtain Theorem 3.

Recall that our multi-message broadcast algorithm works on top of a GST of the graph \(G\). In Sect. 2, we presented an \(O(D\log ^4 n)\)-round distributed GST construction for the unknown topology setting. Refer to Sect. 2.1 for the definition of a GST and of what nodes need to learn in a distributed GST construction. We will use this distributed construction again. However, we first need to enhance it by adding one more element to what nodes learn about the GST: in the multi-message broadcast schedule that we presented in Sect. 3.2, each node \(u\) also needs to know the virtual-distance \(d_u\), which is the directed distance from the source \(s\) to node \(u\) in the virtual graph \(G'\) (refer to Sect. 3.2 for the definitions of \(G'\) and of the virtual-distance). In the setting with known topology, the GST \(\mathcal {T}\) and the respective virtual-distances \(d_u\) are computed by each node locally, without any communication between the nodes. In the next lemma, we show that nodes can easily learn these virtual-distances in the unknown topology setting as well, without changing the asymptotic time complexity of the GST construction.

Lemma 10

In radio networks (even without collision detection), there exists a distributed algorithm that, in \(O(D \log ^4 n)\) rounds, constructs a GST and, moreover, lets each node \(u\) learn its virtual-distance \(d_u\) from the source.

Proof

First, we construct a GST in \(O(D\log ^4 n)\) rounds using the construction of Theorem 4. We now explain that in \(O(D\log ^2 n + \log ^3 n)\) further rounds, nodes can compute the virtual-distance labels.Footnote 6

Recall from Lemma 7 that for each node \(u\), we know that \(d_u \in [1, 2\lceil \log n \rceil ]\). We compute the virtual-distances in a recursive manner based on the value of \(d_u\): consider a \(d \in [1, 2\lceil \log n \rceil -1]\) and suppose that all the nodes \(u\) with a distance label \(d_u \le d\) have already learned their distance \(d_u\). We explain how to identify the nodes \(u\) that have \(d_u=d+1\), in \(O(D \log n + \log ^2 n)\) rounds.

Let \(S_d\) be the set of nodes \(u\) that have received the virtual-distance label \(d_u=d\). Moreover, let \(F_d \subseteq S_d\) be the set of nodes in \(S_d\) that are the first nodes of a fast stretch. Recall from Sect. 2.1 that, since in the GST construction each node \(u\) knows its own rank and the rank of its parent \(v\), node \(u\) knows whether it is the first node of a fast stretch or whether its parent \(v\) belongs to the same fast stretch. We divide the \(O(D \log n+ \log ^2 n)\) rounds of the recursion step for virtual-distance \(d+1\) into two stages, with respectively \(O(D \log n)\) and \(O(\log ^2 n)\) rounds, as follows:

In the first stage, we identify all the nodes that are on the fast stretches starting at nodes of \(F_d\), and we give all of them the virtual-distance label \(d+1\). To do this, we divide this stage between the \(\lceil \log _2 n \rceil \) possible rank values and spend \(2D\) rounds on each rank. That is, we first solve the problem for fast stretches of rank-\(1\) nodes in \(2D\) rounds, then for fast stretches of rank-\(2\) nodes in \(2D\) rounds, etc. For each rank \(r\in [1, \lceil \log _2 n \rceil ]\), we spend \(2D\) rounds, in two epochs each made of \(D\) rounds, as follows:

The \(D\) rounds of the first epoch are as follows: in the \(\ell \)th round, each node that is in \(F_d\), has rank \(r\), and is in BFS-layer \(\ell \) transmits. Each node \(u\) that has not received a virtual-distance label before, has BFS-layer \(\ell +1\) and rank \(r\), and receives a message from its parent gets virtual-distance \(d_u=d+1\). These \(D\) rounds identify the second nodes (those next to the first nodes) in fast stretches of rank \(r\), which must receive virtual-distance \(d+1\).

The \(D\) rounds of the second epoch are as follows: for each \(\ell \in [1, D-1]\), if \(\ell =1\), then let \(S^*\) be the set of nodes that received the virtual-distance label \(d+1\) in the first epoch, and if \(\ell \ge 2\), then let \(S^*\) be the set of nodes that received the virtual-distance label \(d+1\) in the \((\ell -1)\)th round of the second epoch. Then, in the \(\ell \)th round, the nodes of \(S^*\) transmit, and each node \(u\) that has not received a virtual-distance label before, has BFS-layer \(\ell +1\) and rank \(r\), and receives a message from its parent gets virtual-distance \(d_u=d+1\).

Note that, because of the collision-freeness property of the GST, all the nodes of fast stretches of rank \(r\) that start at a node in \(F_d\) will be identified and will receive the distance label \(d+1\). After performing the above two epochs for all the ranks \(r\in [1, \lceil \log _2 n \rceil ]\), we are done with the first stage. Note that the first stage thus takes \(O(D\log n)\) rounds, namely \(2D\) rounds for each of the \(\lceil \log _2 n \rceil \) ranks.

The second stage is as follows: All nodes in \(S_d\) perform \(\varTheta (\log n)\) phases of the Decay protocol for a total of \(\varTheta (\log ^2 n)\) rounds. Each node \(u\) that has not received a virtual-distance label before but receives a message in these rounds sets its virtual-distance label \(d_u= d+1\). \(\square \)
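To summarize the recursion in code, here is a centralized Python sketch of the labelling process (what labels the protocol ends up assigning), assuming each node already knows its BFS layer, its GST parent, its rank, and whether it starts a fast stretch; the two epochs of each rank are merged into a single layer-by-layer sweep, collisions are ignored because the relevant transmissions are collision-free by the GST property, and the Decay phases of the second stage are modelled as always succeeding (which holds with high probability). All names, and the base case \(d_s=1\) for the source, are assumptions of this sketch rather than the paper's notation.

```python
import math

def label_virtual_distances(nodes, parent, rank, layer, starts_stretch, neighbors, s, n):
    """Return a dict mapping each reachable node to its virtual-distance label."""
    log_n = math.ceil(math.log2(n))
    d = {s: 1}                                   # assumed base case for the source
    for dist in range(1, 2 * log_n):             # recursion on the label value d
        S_d = {u for u, du in d.items() if du == dist}
        F_d = {u for u in S_d if starts_stretch[u]}
        # Stage 1: walk down the fast stretches that start in F_d, one rank at a time,
        # labelling one further BFS layer per (simulated) round.
        for r in range(1, log_n + 1):
            frontier = {u for u in F_d if rank[u] == r}
            while frontier:
                frontier = {u for u in nodes
                            if u not in d
                            and rank[u] == r
                            and parent[u] in frontier
                            and layer[u] == layer[parent[u]] + 1}
                for u in frontier:
                    d[u] = dist + 1
        # Stage 2: Theta(log n) Decay phases from S_d; every still-unlabelled
        # neighbour of S_d receives a message (whp) and takes label dist + 1.
        for u in nodes:
            if u not in d and any(w in S_d for w in neighbors[u]):
                d[u] = dist + 1
    return d
```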

Now we use this enhanced distributed GST construction to obtain a multi-message broadcast algorithm for the unknown topology setting with collision detection.

Proof of Theorem 3

As in the proof of Theorem 1, we first use a wave of collisions to get a BFS-layering of the graph. We decompose the graph into rings, each consisting of \(D'= \frac{D}{\log ^4 n}\) consecutive BFS-layers, centered around the source node.Footnote 7 Then, we use the enhanced GST construction presented in Lemma 10 to construct a GST (with the addition of nodes knowing their virtual-distance labels) for each ring, all in \(O(D'\log ^4 n) =O(D)\) rounds, by parallelizing the constructions of the different rings.

Suppose that we are done with the construction of the GSTs of the rings. First, let us assume that the coefficient vectors of linear network coding, which consist of at most \(k\) bits, fit inside one packet; we later explain how to reduce this overhead to \(O(\log n)\).

Let \(k' = \frac{D}{\log ^3 n}\). Divide the messages into batches, each consisting of at most \(k'\) messages. Inside each ring, we can broadcast one batch of messages in \(O(D' + k'\log n+ \log ^2 n) = O(\frac{D}{\log ^4 n} + \log ^2 n)\) rounds, simply by using the algorithm of Sect. 3.3 on top of the GST of this ring. To deliver a batch of messages from one ring to the next, we simply use forward error correction (FEC).Footnote 8 Consider the outer boundary of the \(j\)th ring and the inner boundary of the \((j+1)\)th ring, and consider a batch of messages that has already been delivered to all nodes in the outer boundary of the \(j\)th ring. Then, each of these outer boundary nodes creates \(\varTheta (k')\) packets using an FEC code such that if a node \(w\) receives \(\varTheta (k')\) of these packets, then \(w\) can decode all the \(k'\) messages of the batch in consideration. To deliver these FEC-coded packets, we use \(\varTheta (k')\) phases of the Decay protocol, in which the nodes in the outer boundary of the \(j\)th ring transmit. It follows from Lemma 1 and a simple Chernoff bound that after \(\varTheta (k')= \varOmega (\log n)\) phases of the Decay protocol, each node on the inner boundary of the \((j+1)\)th ring has, with high probability, received at least \(\varTheta (k')\) FEC-coded packets related to the batch in consideration. Thus, these inner boundary nodes of the \((j+1)\)th ring can decode all the messages of this batch. Hence, we conclude that in time \(O(D' + k'\log n+ \log ^2 n) + O(k'\log n) = O(\frac{D}{\log ^4 n} + \log ^2 n)\), with high probability, one batch of messages moves from the inner boundary of the \(j\)th ring to the inner boundary of the \((j+1)\)th ring. That is, in every \(O(\frac{D}{\log ^4 n} + \log ^2 n)\) rounds, one batch of messages moves one ring forward.

Having the above, it is enough to pipeline the batches of messages over the rings. That is, the first batch starts in the first ring and moves one ring forward in each epoch made of \(O(\frac{D}{\log ^4 n} + \log ^2 n)\) rounds. When the first batch is in the third ring (and is starting to be broadcast there), the first ring starts working on the second batch. Note that at any time, the nodes of each ring work on at most one batch. This way, the first batch arrives at the end of the last ring by the end of round \(O(\frac{D}{\log ^4 n} + \log ^2 n) \cdot \log ^4 n = O(D + \log ^6 n)\). Moreover, after that, in every interval of \(O(\frac{D}{\log ^4 n} + \log ^2 n)\) consecutive rounds, one new batch arrives at the end of the last ring. Since there are \(\frac{k}{k'}\) batches, we get that we are done with the broadcast of all messages by the end of round \(O(D + \log ^6 n)+ \frac{k}{k'} \cdot O(\frac{D}{\log ^4 n} + \log ^2 n) = O(D + \log ^6 n)+ \frac{k\log ^3 n}{D} \cdot O(\frac{D}{\log ^4 n} + \log ^2 n) = O(D + k\log n + \log ^6 n)\).

Lastly, we explain how to reduce the overhead coming from including the coefficient vector in RLNC-coded packets from \(k\) bits to \(O(\log n)\) bits. This is done by grouping the messages into batches of \(O(\log n)\) messages and only coding together messages within the same batch. This change affects only the transmissions within a ring; the process of moving the messages between the boundaries of two consecutive rings remains the same as above, which is fine since the coding overhead of the FEC is only a constant factor.

Inside each ring, we do the following: consider the \(j\)th ring, for a \(j \in [1, \varTheta (\log ^4 n)]\), and the GST of that ring. For each node \(u\) in this ring, define the height of \(u\) as \(h_u= d_u \lceil \log _2 n \rceil + l_u\), where \(d_u\) is the virtual-distance of \(u\) in this ring and \(l_u\) is the (normalized) BFS layer of \(u\) for this ring (that is, the BFS layer of \(u\) in the BFS layering of the original graph \(G\) minus \(j \cdot D'\)). Note that this definition of height exactly matches the potential function defined in the proof of Theorem 2. Moreover, note that for each node \(u\), we have \(h_u \le 2\lceil \log n \rceil ^2+ D' = O(D'+\log ^2 n)\). Fix \(W=\varTheta (\log ^2 n)\). Based on the height, we decompose the \(j\)th ring into strips as follows: all nodes \(u\) in the \(j\)th ring that have \(h_u \in [(j'-1) \cdot W, j' \cdot W]\) are in strip number \(j'\).
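As a small illustration of this decomposition, a node in the \(j\)th ring could compute its strip index from the quantities it already knows; the helper below is hypothetical and simply restates the definition of \(h_u\) and of the strips, with \(W\) fixed to \(c\lceil \log _2 n \rceil ^2\) for some constant \(c\).

```python
import math

def strip_index(d_u, bfs_layer, j, D_prime, n, c=1):
    """Strip number of a node in the j-th ring, given its virtual-distance d_u and
    its BFS layer in G (names and the choice of c are illustrative assumptions)."""
    log_n = math.ceil(math.log2(n))
    W = c * log_n ** 2                   # strip width W = Theta(log^2 n)
    l_u = bfs_layer - j * D_prime        # BFS layer normalized to this ring
    h_u = d_u * log_n + l_u              # height, as defined above
    return h_u // W + 1                  # strips are numbered j' = 1, 2, ...
```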

Now, to reduce the header overhead caused by coding to \(O(\log n)\) bits, instead of dividing the messages into batches of size \(k' = \frac{D}{\log ^3 n}\), we divide them into smaller batches, each consisting of \(k''=\varTheta (\log n)\) messages. Thus, the RLNC coefficient vectors of each batch are \(\varTheta (\log n)\) bits long and hence fit inside one packet for any packet size \(B=\varOmega (\log n)\). Now we use the transmission schedule of Sect. 3.2, but coding packets only within one batch and one strip. That is, we run the schedule of Sect. 3.2 in steps consisting of \(\varTheta (\log ^2 n)\) rounds. If a node has not received all the messages of one batch at the end of a step, then it ignores all the packets it received in this step (that is, it empties its buffer) and restarts in the next step. Following the proof of Theorem 2, we see that in each step of \(\varTheta (\log ^2 n)\) rounds, each batch moves one strip forward, with high probability. That is, for each particular batch, in every \(\varTheta (\log ^2 n)\) rounds, the maximum height up to which nodes have received all the messages of this batch increases by at least \(\varTheta (\log ^2 n)\), with high probability. Since the maximum height in the ring is \(O(D'+\log ^2 n)\), we get that in \(O(D'+\log ^2 n)\) rounds, the first batch moves from the start of the ring to the end of the ring. After this, in every \(\varTheta (\log ^2 n)\) further rounds, another batch of messages arrives at the end layer of the ring. Combining this with the pipelining argument between different rings, we get that the very first batch reaches the outer boundary of the last ring after \(O(D+\log ^6 n)\) rounds. After that, in every \(\varTheta (\log ^2 n)\) rounds, one new batch made of \(\varTheta (\log n)\) messages arrives at the outer boundary of the last ring. Hence, after \(O(D+k\log n+ \log ^6 n)\) rounds, all batches are broadcast to all nodes of the graph.

\(\square \)