Keywords

1 Introduction

Population protocols have been introduced in [1, 2] as a restricted yet useful subclass of general distributed protocols. In population protocols each agent has a constant amount of local storage, and during the protocol execution pairs of agents are selected and permitted to interact. The selection of pairs is assumed to be done by an adversary bound by a fairness condition. The fairness condition ensures that the adversary cannot trivially stall the protocol. A typical fairness condition requires that every configuration that stays reachable during an infinite execution is reached infinitely many times.

Population protocols have been studied from various points of view, such as expressive power [5], verification complexity [19], time to convergence [3, 17], privacy [13], impact of different interaction scheduling [10] etc. Multiple related models have been introduced. Some of them change or restrict the communication structure: this is the case for immediate, delayed, and queued transmission and observation [5], as well as for broadcast protocols [18]. Some explore the implications of adding limited amounts of storage (below the usual linear or polynomial storage permitted in traditional distributed protocols): this is the case for community protocols [23] (which allow an agent to recognise a constant number of other agents), PALOMA [11] (permitting logarithmic amount of local storage), mediated population protocols [26] (giving some constant amount of common storage to every pair of agents), and others.

The original target application of population protocols and related models is modelling networks of restricted sensors, starting from the original paper [1] on population protocols. On the other hand, verifying distributed algorithms benefits from translating the algorithms in question or their parts into a restricted setting, as most problems are undecidable in the unrestricted case. Both applications motivate study of fault tolerance. Some papers on population protocols and related models [4, 12, 23, 24] consider questions of fault tolerance, but in the context of expressive power the fault is typically expected to be either a total agent failure or a Byzantine failure. There are some exceptions such as a study of fine-grained notions of unreliability [14, 15] in the context of step-by-step simulation of population protocols by distributed systems with binary interactions. However, these studies answer a completely different set of questions, as they are concerned with simulating a protocol as a process as opposed to designing a protocol to achieve a given result no matter in what way.

In a practical context, many distributed algorithms pay attention to a specific kind of failure: message loss. While the eventual convergence approach typical in study of population protocols escapes the question of availability during a temporary network partition (the problem studied, for example, in [22]), the onset of a network partition may include message loss in the middle of an interaction. In such a situation the participants do not always agree whether the interaction has succeeded or failed. In terms of population protocols, one of the agents assumes that an interaction has happened and updates the local state, while a counterparty thinks the interaction has failed and keeps the old state.

In the present paper we study the expressive power of a very wide class of models with interacting constant-storage agents when unreliability of communication is introduced. This unreliability corresponds to the loss of atomicity of interactions due to message loss. Indeed, in the distributed systems ensuring that both sides agree on whether the interaction has taken place is often the costliest part; a special case of it is “exactly-once” message arrival, known to be much more complex to ensure than “at most-once”. We model such loss of atomicity by allowing some agents to update their state based on an interaction, while other agents keep their original state because they assume the interaction has failed. For a bit more generality, corresponding, for example, to request-response interactions with the response being impossible if the request is lost, we allow to require that some agents can only update their state if the others do.

We consider the expressive power in the context of computing predicates by protocols with eventual convergence of individual opinions. We show that under very general conditions the expressive power of protocols with unreliable communication coincides with the expressive power of immediate observation population protocols. Immediate observation population protocols, modelling interactions where an agent can observe the state of another one without the observe noticing, provide a model that inherently tolerates unreliability and is considered a relatively weak model in the fully reliable case. This model also has other nice properties, such as relatively low complexity (\(\mathbf {PSPACE}\)-complete) of verification tasks [21]. Our results hold under any definition of fairness satisfying two general assumptions (see Definition 5), including all the usually used versions of fairness.

We prove it by observing a general structural property shared by all protocols with unreliable communication. Informally speaking, protocols with unreliable communication have some special fair executions which can be extended by adding an additional agent with the same initial and final state as a chosen existing one. This property is similar to the copycat arguments used, for example, for proving the exact expressive power of immediate observation protocols. The usual structure of the copycat arguments includes a proof that we can pick an agent in an execution and add another agent (copycat) which will repeat all the state transitions of the chosen one. In the immediate observation case the corresponding property is almost self-evident once defined. A slightly stronger but still straightforward argument is needed in the case of reconfigurable broadcast networks [8]. The latter model is equivalent to unreliable broadcast networks; a sender broadcasts a message and changes the local state, and an arbitrary set of receivers react to the message (immediately). However, unlike all the previous uses of the copycat-like arguments in the context of population protocols and similar models, proving the necessary copycat-like property for a general notion of protocols with unreliable communication (sufficient to handle assymmetry of message loss where loss for sender requires loss for receiver) requires careful analysis using different techniques.

Note that although the natural way to design population protocols for our setting involves the use of immediate observation population protocols, we still need to rule out additional opportunities arising from the fact that eventually a two-agent interaction with both agents correctly updated will happen. However, in contrast to self-stabilising protocols [6, 16], the protocols cannot rely on the message loss being absent for an arbitrarily long time.

Surprisingly, asynchronous transmission and receipt of messages, which provides more expressive power than immediate observation population protocols in the reliable setting, turns out to have strictly less expressive power in the unreliable setting. Note that message reordering is allowed already in the reliable setting, while unreliability is essentially a generalisation of message loss. One could say that an unbounded delay in message delivery becomes a liability instead of an asset once there is message loss.

The rest of the present paper is organised as follows. First, in Sect. 2 we define a general protocol framework generalising many previously studied approaches. Then in Sect. 3 we summarise the results from the literature on the expressive power of various models covered by this framework. Afterwards in Sect. 4 we formally define our general notion of a protocol with unreliable communication. Then in Sect. 5 we formalise the common limitation of all the protocols with unreliable communication, and provide the proof sketches of this restriction and the main result. Afterwards in Sect. 6 we show that fully asynchronous (message-based) models become strictly less powerful than immediate observation in the unreliable setting. The paper ends with a brief conclusion and some possible future directions.

Due to the space constraints the detailed proofs are provided in the full version [27].

1.1 Main Results (Preview)

The precise statements of our results require the detailed definitions introduced later. However, we can roughly summarise them as follows.

First, we characterise the expressive power of all fixed-memory protocols given unreliable communication.

Proposition 1

Adding unreliability of communication to population protocols restricts the predicates they can express to boolean combinations of comparisons of arguments with constants.

This is the same expressive power as the immediate observation protocols.

Next we show that unreliability of communication changes the expressive power non-monotonically for some natural classes.

Proposition 2

Queued transmission protocols with unreliable communication are strictly less expressive than immediate observation population protocols (with or without unreliable communication).

Note that without unreliability queued transmission protocols are strictly more expressive than immediate observation population protocols.

2 Basic Definitions

2.1 Protocols

We consider various models of distributed computation where the number of agents is constant during protocol execution, each agent has a constant amount of local storage, and agents cannot distinguish each other except via the states. We provide a general framework for describing such protocols. Note that we omit some very natural restrictions (such as decidability of correctness of a finite execution) because they are irrelevant for the problems we study. We allow agents to be distinguished and tracked individually for the purposes of analysis, even though they cannot identify each other during the execution of the protocol.

We will use the following problem to illustrate our definitions: the agents have states \(q_0\) and \(q_1\) corresponding to input symbols 0 and 1 and aim to find out if all the agents have the same input. They have an additional state \(q_\bot \) to represent the observation that both input symbols were present. We will define four protocols for this problem using different communication primitives.

  • Two agents interact and both switch to \(q_\bot \) unless they are in the same state (population protocol interaction).

  • An agent observes another agent and switches to \(q_\bot \) if they are in different states (immediate observation).

  • An agent can send a message with its state, \(q_0\), \(q_1\) or \(q_\bot \). An agent in a state \(q_0\) or \(q_1\) can receive a message (any of the pending messages, regardless of order); the agent switches to \(q_\bot \) if the message contains a state different from its own (queued transmission).

  • An agent broadcasts its state without changing it; each other agent receives the broadcast simultaneously and switches to \(q_\bot \) if its state is different from the broadcast state (broadcast protocol interaction).

Definition 1

A protocol is specified by a tuple \((Q,M,\varSigma ,I,o,\mathrm {Tr},\varPhi )\), with components being a finite nonempty set Q of (individual agent) states, a finite (possibly empty) set M of messages, a finite nonempty input alphabet \(\varSigma \), an input mapping function \(I: \varSigma \rightarrow Q\), an individual output function \(o: Q\rightarrow \{true,false\}\), a transition relation \(\mathrm {Tr}\) (which is described in more details below), and a fairness condition \(\varPhi \) on executions.

The protocol defines evolution of populations of agents (possibly with some message packets being present).

Definition 2

A population is a pair of sets: A of agents and P of packets. A configuration C is a population together with two functions, \(C_A: A\rightarrow Q\) provides agent states, and \(C_P: P\rightarrow M\) provides packet contents.

Note that if M is empty, then P must also be empty. As the set of agents is the domain of the function \(C_A\), we use the notation \(\mathrm {Dom}(C_A)\) for it. The same goes for the set of packets \(\mathrm {Dom}(C_P)\). Without loss of generality \(\mathrm {Dom}(C_P)\) is a subset of a fixed countable set of possible packets.

The message packets are only used for asynchronous communication; instant interaction between agents (such as in the classical rendezvous-based population protocols or in broadcast protocols) does not require describing the details of communication in the configurations.

Example 1

The four example protocols have the same set of states, namely \(Q=\{q_0,q_1,q_\bot \}\). The first two protocols have the empty set of messages, and the last two have the set of messages \(M=\{m_0,m_1,m_\bot \}\). The example protocols all have the same input alphabet \(\varSigma =\{0,1\}\), input mapping \(I: i\mapsto q_i\), and output mapping \(o: q_0\mapsto \mathrm {true}, q_1\mapsto \mathrm {true}, q_\bot \mapsto \mathrm {false}\).

The definition of the transition relation uses the following notation.

Definition 3

For a function f and \(x\notin \mathrm {Dom}(f)\) let \(f\cup \{x\mapsto y\}\) denote the function g defined on \(\mathrm {Dom}(f)\cup \{x\}\) such that \(g\mid _{\mathrm {Dom}(f)}=f\) and \(g(x)=y\). For \(u\in \mathrm {Dom}(f)\) let \(f[u\mapsto v]\) denote the function h defined on \(\mathrm {Dom}(f)\) such that \(h\mid _{\mathrm {Dom}(f)\setminus \{u\}}=f\mid _{Dom(f)\setminus \{u\}}\) and \(h(u)=v\). For symmetry, if \(w=f(u)\) let \(f\setminus \{u\mapsto w\}\) denote restriction \(f\mid _{\mathrm {Dom}(f)\setminus \{u\}}\).

Use of this notation implies an assertion of correctness, i.e. \(x\notin \mathrm {Dom}(f)\), \(u\in \mathrm {Dom}(f)\), and \(w=f(u)\). We use the same notation with a configuration C instead of a function if it is clear from context whether \(C_A\) or \(C_P\) is modified.

Now we can describe the transition relation that tells us which configurations can be obtained from a given one via a single interaction. In order to cover broadcast protocols we define the transition relation as a relation on configurations. The restrictions on the transition relation ensure that the protocol behaves like a distributed system with arbitrarily large number of anonymous agents.

Definition 4

The transition relation of a protocol is a set of triples \((C,A^\odot ,C')\), called transitions, where C and \(C'\) are configurations and \(A^\odot \subset \mathrm {Dom}(C_A)\) is the set of active agents (of the transition); agents in \(\mathrm {Dom}(C_A)\setminus A^\odot \), are called passive. We write \(C\xrightarrow {A^\odot } C'\) for \((C, A^\odot , C')\in \mathrm {Tr}\), and let \(C\rightarrow C'\) denote the projection of \(\mathrm {Tr}\): \(C\rightarrow {}C'\Leftrightarrow {}\exists {A^\odot }:C\xrightarrow {A^\odot }C'\). The transition relation must satisfy the following conditions for every transition \(C\xrightarrow {A^\odot } C'\):

  • Agent conservation. \(\mathrm {Dom}(C_A) = \mathrm {Dom}(C'_A)\).

  • Agent and packet anonymity. If \(h_A\) and \(h_P\) are bijections such that \(D_A = C_A \circ h_A\), \(D'_A = C'_A \circ h_A\), \(D_P= C_P \circ h_P \), and \(D'_P = C'_P \circ h_P \), then \(D\xrightarrow {h^{-1}(A^\odot )}D'\).

  • Possibility to ignore extra packets. For every \(p \notin \mathrm {Dom}(C_P) \cup \mathrm {Dom}(C'_P)\) and \(m \in M\): \(C \cup \{p \mapsto m\} \xrightarrow {A^\odot } C' \cup \{p \mapsto m\}\).

  • Possibility to add passive agents. For every agent \(a \notin \mathrm {Dom}(C_A)\) and \(q \in Q\) there exists \(q' \in Q\) such that: \(C \cup \{a \mapsto q\} \xrightarrow {A^\odot } C' \cup \{a \mapsto q'\}\).

Informally speaking, the active agents are the agents that transmit something during the interaction. The passive agents can still observe other agents and change their state. The choice of active agents is used for the definition of protocols with unreliable communication, as a failure to transmit precludes success of reception. The formal interpretation will be provided in Definition 8.

Many models studied in the literature have the transition relation defined using pairwise interaction. In these models the transitions are always changing the states of two agents based on their previous states. When discussing such protocols, we will use the notation \((p, q)\rightarrow (p', q')\) for a transition where agents in the states p and q switch to states \(p'\) and \(q'\), correspondingly.

Example 2

The four example protocols have the following transition relations.

  • In the first protocol for a configuration C and two agents \(a,a'\in \mathrm {Dom}(C_A)\) such that \(C_A(a)\ne C_A(a')\) we have \(C\xrightarrow {\{a,a'\}} C[a\mapsto q_\bot ][a'\mapsto q_\bot ]\) (in other notation, \((C,\{a,a'\}, C[a\mapsto q_\bot ][a'\mapsto q_\bot ])\in \mathrm {Tr}\)).

  • In the second protocol for a configuration C and two agents \(a,a'\in \mathrm {Dom}(C_A)\) such that \(C_A(a)\ne C_A(a')\) we have \(C\xrightarrow {\{a\}} C[a\mapsto q_\bot ]\). We can say that a observes \(a'\) in a different state and switches to \(q_\bot \).

  • In the third protocol there are two types of transitions. Let a configuration C be fixed. For an agent \(a\in \mathrm {Dom}( C_A)\), \(i\in \{0,1,\bot \}\) such that \(C_A(a)=q_i\), and a new message identity \(p\notin \mathrm {Dom}( C_P)\) we have \(C\xrightarrow {\{a\}} C\cup \{p\mapsto m_i\}\) (sending a message). If \(C_A(a)=q_i\) for some \(i\in \{0,1\}\), for each message \(p\in \mathrm {Dom}(C_P)\), we also have \(C \xrightarrow {\{a\}} C[a\mapsto q']\setminus \{p\mapsto C_P(p)\}\) where \(q'\) is equal to \(q_i\) if \(C_P(p)=m_i\) and \(q_\bot \) otherwise (receiving a message).

  • In the fourth protocol, for a configuration C and an agent \(a\in \mathrm {Dom}( C_A)\) we can construct \(C'\) by replacing \(C_A\) with \(C'_A\) that maps each \(a'\in \mathrm {Dom}( C_A)\) to \(C_A(a')\) if \(C_A(a)=C_A(a')\) and \(q_\bot \) otherwise. Then we have \(C\xrightarrow {\{a\}}C'\). We can say that a broadcasts its state and all the agents in the other states switch to \(q_\bot \).

2.2 Fair Executions

In this section we define the notion of fairness. This notion is traditionally used to exclude the most pathological cases without a complete probabilistic analysis of the model. For the population protocols fairness has been a part of the definition since the introduction [1, 2]. However, in the general study of distributed computation there has long been some interest in comparing effects of different approaches to fairness in execution scheduling [7]. For example, the distinction between weak fairness and strong fairness and the conditions where one can be made to model the other has been studied in [25]. The difference between weak and strong scheduling is that strong fairness executes infinitely often every interaction that is enabled infinitely often, while weak fairness only guarantees anything for continuously enabled interactions. As there are multiple notion of fairness in use, we define their basic common traits. Our results hold for all notions of fairness satisfying these basic requirements, including all the notions of fairness used in the literature, as well as much stronger and much weaker fairness conditions.

Definition 5

An execution is a sequence (finite or infinite) \(C_n\) of configurations such that at each moment i either nothing changes, i.e. \(C_i=C_{i+1}\) or a single interaction occurs, i.e. \(C_i\rightarrow C_{i+1}\). A configuration \(C'\) is reachable from configuration C if there exists an execution \(C_0,\ldots ,C_n\) with \(C_0=C\) and \(C_n=C'\) (and unreachable otherwise).

A protocol defines a fairness condition \(\varPhi \) which is a predicate on executions. It should satisfy the following properties.

  • A fairness condition is eventual, i.e. every finite execution can be continued to an infinite fair execution.

  • A fairness condition ensures activity, i.e. if an execution contains only configuration C after some moment, only C itself is reachable from C.

Definition 6

The default fairness condition accepts an execution if every configuration either becomes unreachable after some moment, or occurs infinitely many times.

It is clear that the default fairness condition ensures activity.

Lemma 1

[adapted from [5]] Default fairness condition is eventual.

The fairness condition is sometimes said to be an approximation of probabilistic behaviour. In our general model the default fairness condition provides executions similar to random ones for protocols without messages but not always for protocols with messages. The arguments from [20] with minimal modification prove this. The core idea in the case without messages is observing we have a finite state space reachable from any given configuration; a random walk eventually gets trapped in some strongly connected component, visiting all of its states infinitely many time. If we do have messages, the message count might behave like a biased random walk; while consuming all the messages stays possible in principle, with probability one it only happens a finite number of times.

2.3 Functions Implemented by Protocols

In this section we recall the standard notion of a function evaluated by a protocol. Here the standard definition generalises trivially.

Definition 7

An input configuration is a configuration where there are no packets and all agents are in input states, i.e. \(P=\varnothing \) and \(\mathrm {Im}(C_A) \subseteq \mathrm {Im}(I)\) where \(\mathrm {Im}\) denotes the image of a function. We extend I to be applicable to multisets of input symbols. For every \(\overline{x}\in \mathbb {N}^\varSigma \), we define \(I(\overline{x})\) to be a configuration of \(|\overline{x}|\) agents with \(\sum _{I(\sigma )=q_i}\overline{x}(\sigma )\) agents in input state \(q_i\) (and no packets).

A configuration C is a consensus if the individual output function yields the same value for the states of all agents, i.e. \(\forall a,a'\in \mathrm {Dom}(C_A): o(C_A(a))=o(C_A(a'))\). This value is the output value for the configuration. C is a stable consensus if all configurations reachable from C are consensus configurations with the same value.

A protocol implements a predicate \(\varphi : \mathbb {N}^\varSigma \rightarrow \{true,false\}\) if for every \(\overline{x}\in \mathbb {N}^\varSigma \) every fair execution starting from \(I(\overline{x})\) reaches a stable consensus with the output value \(\varphi (\overline{x})\). A protocol is well-specified if it implements some predicate.

Example 3

It is easy to see that each of the four example protocols implements the predicate \(\varphi (\overline{x})\Leftrightarrow (x(0)=0)\vee (x(1)=0)\) on \(\mathbb {N}^2\). In other words, the protocol accepts the input configurations where one of the two input states has zero agents and rejects the configurations where both input states occur.

This framework is general enough to define the models studied in the literature, such as population protocols, immediate transmission protocols, immediate observation population protocols, delayed transmission protocols, delayed observation protocols, queued transmission protocols, and broadcast protocols. The details can be found in the full version.

3 Expressive Power of Population Protocols and Related Models

In this section we give an overview of previously known results on expressive power of various models related to population protocols. We only consider predicates, i.e. functions with the output values being \(\mathrm {true}\) and \(\mathrm {false}\) because the statements of the theorems become more straightforward in that case.

The expressive power of models related to population protocols is expressed in terms of semilinear, \(\mathbf {coreMOD}\), and counting predicates. Semilinear predicates on tuples of natural numbers can be expressed using the addition function, remainders modulo constants, and the order relation, such as \(x+x \ge {} y+3\) or \(x\mod 7=3\). Roughly speaking, \(\mathbf {coreMOD}\) is the class of predicates that become equivalent to modular equality for inputs with only large and zero components. An example could be \((z=1 \wedge x\ge {}y) \vee (x+y \mod 2 = 0)\), a semilinear predicate which becomes a modular equality whenever \(z=0\) or z is large (i.e. \(z\ge {}2\)). Counting predicates are logical combinations of inequalities including one coordinate and one constant each, for example, \(x\ge {3}\).

Theorem 1

(see [5] for details). Population protocols and queued transmission protocols can implement precisely semilinear predicates.

Immediate transmission population protocols and delayed transmission protocols can implement precisely all the semilinear predicates that are also in \(\mathbf {coreMOD}\).

Immediate observation population protocols implement counting predicates.

Delayed observation protocols implement the counting predicates where every constant is equal to 1.

Theorem 2

(see [9] for details). Broadcast protocols implement precisely the predicates computable in nondeterministic linear space.

4 Our Models

4.1 Proposed Models

We propose a general notion of an unreliable communication version of a protocol. Our notion models transient failures, so the set of agents is preserved. The intuition we formalise is the idea that for every possible transition some agents may fail to update their states (and keep their corresponding old states). We also require that for some passive agent to receive a transmission, the transmission has to occur (and active agents who transmit do not update their state if they fail to transmit, although a successful transmission can still fail to be received).

Definition 8

A protocol with unreliable communication, corresponding to a protocol \(\mathcal {P}\), is a protocol that differs from \(\mathcal {P}\) only in the transition relation. For every allowed transition \(C\xrightarrow {A^\odot } C'\) we also allow all the transitions \(C\xrightarrow {A^\odot } C''\) where \(C''\) satisfies the following conditions.

  • Population preservation. \(\mathrm {Dom}(C''_A)=\mathrm {Dom}(C'_A)\), as well as \(\mathrm {Dom}(C''_P) = \mathrm {Dom}(C'_P)\).

  • State preservation. For every agent \(a\in \mathrm {Dom}(C''_A)\) we have \(C''_A(a)\in \{C_A(a), C'_A(a)\}\).

  • Message preservation. For every packet \(p\in \mathrm {Dom}(C''_P)\): \(C''_P(p)=C'_P(p)\).

  • Reliance on active agents. Either for every agent \(a\notin A^\odot \) we have \(C''_A(a) = C_A(a)\), or for every agent \(a\in A^\odot \) we have \(C''_A(a) = C'_A(a)\).

Example 4

  • Population protocols with unreliable communication allow an interaction to update the state of only one of the two agents.

  • Immediate transmission population protocols with unreliable communication allow the sender to update the state with no receiving agents.

  • Immediate observation population protocols with unreliable communication do not differ from ordinary immediate observation population protocols, because each transition changes the state of only one agent. Failing to change the state means a no-change transition which is already allowed anyway.

  • Queued transmission protocols with unreliable communication allow messages to be discarded with no effect. Note that for delayed observation protocols unreliable communication doesn’t change much, as sending the messages also has no effect.

  • Broadcast protocols with unreliable communication allow a broadcast to be received by an arbitrary subset of agents.

4.2 The Main Result

Our main result is that no class of protocols with unreliable communication can be more expressive than immediate observation protocols.

Definition 9

A cube is a subset of \(\mathbb {N}^k\) defined by a lower and upper (possibly infinite) bound for each coordinate. A counting set is a finite union of cubes.

A counting predicate is a membership predicate for some counting set. Alternatively, we can say it is a predicate that can be computed using comparisons of input values with constants and logical operations.

Theorem 3

The set of predicates that can be implemented by protocols with unreliable communication is the set of counting predicates. All counting predicates can be implemented by (unreliable) immediate observation protocols.

5 Proof of the Main Result

Our main lemma is generalises of the copycat lemma normally applied to specific models such as immediate observation protocols. The idea is that for every initial configuration there is a fair execution that can be extended to a possibly unfair execution by adding a copy of a chosen agent. In some special cases, for example, broadcast protocols with unreliable communication, a simple proof can be given by saying that if the original agent participates in an interaction, the copy should do the same just before the original without anyone ever receiving the broadcasts from the copy. The copycat arguments are usually applied to models where a similar proof suffices. The situation is more complex for models like immediate transmission protocols with unreliable communication. As a message cannot be received without being sent, the receiver cannot update its state if the sender doesn’t. We present an argument applicable in the general case.

Definition 10

Let E be an arbitrary execution of protocol P with initial configuration C. Let \(a\in {}\mathrm {Dom}(C_A)\) be an agent in this execution. Let \(a'\notin \mathrm {Dom}(C_A)\) be an agent, and \(C'=C\cup \{a'\mapsto C_A(a)\}\). A set \(\mathfrak {E}_a\) of executions starting in configuration \(C'\) is a shadow extension of the execution E around the agent a if the following conditions hold:

  • removing \(a'\) from each configuration in any execution in \(\mathfrak {E}_a\) yields E;

  • for each moment during the execution, there is a corresponding execution in \(\mathfrak {E}_a\) such that a and \(a'\) have the same state at that moment.

The added agent \(a'\) is a shadow agent, and elements of \(\mathfrak {E}_a\) are shadow executions. A protocol P is shadow-permitting if for every configuration C there is a fair execution starting from C that has a shadow extension around each agent \(a\in {}\mathrm {Dom}(C_A)\).

Note that the executions in \(\mathfrak {E}_a\) might not be fair even if E is fair.

Not all population protocols are shadow-permitting. For example, consider a protocol with one input state \(q_0\), additional states \(q_+\) and \(q_-\), and one transition \((q_0,q_0)\rightarrow (q_+,q_-)\). As the number of agents in the states \(q_+\) and \(q_-\) is always the same, one can’t add a single extra agent going from state \(q_0\) to state \(q_+\).

Lemma 2

All protocols with unreliable communication are shadow-permitting.

The intuition behind the proof is the following. We construct a fair execution together with the shadow executions and keep track what states can be reached by the shadow agents. The set of reachable states will not shrink, as the shadow agent can always just fail to update. If an agent a tries to move from a state q to a state \(q'\) not reachable by the corresponding shadow agent in any of the shadow executions, we “split” the shadow execution reaching q: one copy just stays in place, and in the other the shadow agent \(a'\) takes the place of a in the transition while a keeps the old state. In the main execution there is no \(a'\) so a participates in the interaction but fails to update. Afterwards we restart the process of building a fair execution.

We also use a straightforward generalisation of the truncation lemma from [5]. The lemma says that all large amounts of agents are equivalent for the notion of stable consensus.

Definition 11

A protocol is truncatable if there exists a number K such that for every stable consensus adding an extra agent with a state q that is already represented by at least K other agents yields a stable consensus.

Lemma 3

[adapted from [5]] All protocols (not necessarily with unreliable communication) are truncatable.

Lemma 4

If a predicate \(\varphi \) can be implemented by a shadow-permitting truncatable protocol, then \(\varphi \) is a counting predicate.

For the lower bound, we adapt the following lemma from [5].

Lemma 5

All counting predicates can be implemented by immediate observation protocols (possibly with unreliable communication), even if the fairness condition is replaced with an arbitrary different (activity-ensuring) one.

Theorem 3 now follows from the fact that all the protocols with unreliable communication are shadow-permitting (by Lemma 2) and truncatable (by Lemma 3), therefore they only implement counting predicates. By Lemma 5 all counting predicates can be implemented.

6 Non-monotonic Impact of Unreliability

In this section we observe that, surprisingly, while delayed transmission protocols and queued transmission protocols are more powerful than immediate observation population protocols, their unreliable versions are strictly less expressive than immediate observation population protocols (possibly with unreliable communication).

Definition 12

A protocol is fully asynchronous if for each allowed transition \((C,A^\odot ,C')\) the following conditions hold.

  • There is exactly one active agent, i.e. \(|A^\odot |=1\).

  • No passive agents change their states.

  • Either the packets are only sent or the packets are only consumed, i.e. either \(\mathrm {Dom}(C_P)\subseteq \mathrm {Dom}(C'_P)\) or \(\mathrm {Dom}(C_P)\supseteq \mathrm {Dom}(C'_P)\). Packet contents do not change, i.e. \(C_P\mid _{\mathrm {Dom}(C_P)\cap \mathrm {Dom}(C'_P)} = C'_P\mid _{\mathrm {Dom}(C_P)\cap \mathrm {Dom}(C'_P)}\).

To prove that fully asynchronous protocols with unreliable communication cannot compute some counting predicates, we consider a simple predicate: all agents have the same input, and the protocol must determine whether there are at least two agents.

Theorem 4

A well-specified fully asynchronous protocol with unreliable communication having a single-letter input alphabet yields the same value for the input configurations with one and two agents, correspondingly.

Remark 1

This result doesn’t mean that fundamentally asynchronous nature of communication prevents us from using any expressive models for verification of unreliable systems. It is usually possible to keep enough state to implement, for example, Immediate observation via request and response.

7 Conclusion and Future Directions

We have studied unreliability based on message loss, a practically motivated approach to fault tolerance in population protocols. We have shown that inside a general framework of defining protocols with unreliable communication we can prove a specific structural property that bounds the expressive power of protocols with unreliable communication by the expressive power of immediate observation population protocols. Immediate observation population protocols permit verification of many useful properties, up to well-specification, correctness and reachability between counting sets, in polynomial space. We think that relatively low complexity of verification together with inherent unreliability tolerance and locally optimal expressive power under atomicity violations motivate further study and use of such protocols.

It is also interesting to explore if for any class of protocols adding unreliability makes some of the verification tasks easier. Both complexity and expressive power implications of unreliability can be studied for models with larger per-agent memory, such as community protocols, PALOMA and mediated population protocols. We also believe that some models even more restricted than community protocols but still permitting a multi-interaction conversation are an interesting object of study both in the reliable and unreliable settings.