Introduction

How a group of individuals reaches a consensus through local interactions and without any centralised authority is widely studied in social, biological, political and information sciences1,2,3,4,5,6. The voter model7,8, thanks to its simplicity and tractability, has been employed both to describe a large variety of natural systems (from plants to humans) across all scales of biological complexity9,10,11,12 and to engineer decentralised artificial systems, such as robot swarms5. Recent studies have, however, shown that the consensus dynamics of the voter model can easily be jeopardised by even a minimal number of self-willed individuals13,14,15. For each alternative, the presence of just one inflexible individual (also called a zealot or a stubborn individual in the literature), who influences others but never changes their own opinion, prevents the entire population from reaching any stable agreement (i.e. a large majority with the same opinion)13. Equivalently, in the absence of zealots, the population cannot achieve a stable majority if every individual makes spontaneous (asocial) changes of opinion, even if only sporadically (modelled as the noisy voter model14).

Here, we show that a model of comparable simplicity, the cross-inhibition model16,17,18, makes the population resilient to the presence of both self-willed and inflexible individuals and able to reach a stable consensus. These results contribute to explaining why inhibitory signals evolved in biological systems that need group consensus despite operating in noisy contexts, such as social insect colonies and neuronal populations16,19. Additionally, our findings are relevant to the design of minimal interaction patterns that allow social networks and decentralised robotic systems to reach an agreement, despite the presence of noise or the asocial behaviour of some individuals, whether unintentional or malicious20,21. In both the voter and cross-inhibition models, population consensus emerges from pairwise interactions between individuals, with a randomly selected speaker and listener. The main difference between the two models is that when the speaker and the listener have different opinions, in the voter model, the listener directly switches to the speaker’s opinion, while in the cross-inhibition model, the listener becomes undecided or, in other words, gets inhibited and remains without an opinion. Only undecided individuals switch to the speaker’s opinion (see Fig. 1a, b).
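The difference between the two interaction rules can be sketched in a few lines of Python. This is only an illustration of the rules described above, not the implementation used in this study: the function name `update_listener` and the string labels are hypothetical, and we assume that an undecided speaker has no opinion to transmit.

```python
def update_listener(listener, speaker, model):
    """Return the listener's opinion after one pairwise interaction.

    Opinions are "X", "Y" or "U" (undecided; used by cross-inhibition only).
    Assumption: an undecided speaker does not influence the listener.
    """
    if speaker == "U" or listener == speaker:
        return listener  # nothing to transmit, or the two already agree
    if model == "voter":
        return speaker  # the listener copies the speaker directly
    if model == "cross-inhibition":
        # An undecided listener is recruited; a committed listener with the
        # opposite opinion is inhibited and becomes undecided.
        return speaker if listener == "U" else "U"
    raise ValueError(f"unknown model: {model}")


print(update_listener("Y", "X", "voter"))             # X
print(update_listener("Y", "X", "cross-inhibition"))  # U
print(update_listener("U", "X", "cross-inhibition"))  # X
```

Under cross-inhibition, a committed individual therefore needs two successive interactions with the opposite opinion before adopting it, whereas under the voter model one interaction suffices.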

Fig. 1: In nature and robotics, populations of simple individuals can make consensus decisions through the cross-inhibition model.

a, b Motifs of the voter model and the cross-inhibition model. The commitment of an individual (in circles) changes upon interaction with peers (illustrated by solid arrows with a small circle that indicates the peer’s commitment). In the voter model, individuals directly switch their commitment between the two alternative opinions, X and Y. Instead, in b, individuals with different opinions cross-inhibit each other, and the inhibited individual reverts to an uncommitted state. Individuals without any opinion can be recruited by committed peers. Both models can be subject to noise (dashed arrow) that corresponds to an asocial switch of opinion (independent of peers) which can be caused by self-sourcing information. Noise in the cross-inhibition model can be of two alternative types (dashed or dotted arrows). c Zealots are inflexible and never change their opinion upon receiving neighbours' votes. d Biological inspiration can be employed to design safe robotic systems able to reach a stable consensus. The cross-inhibition model has been first introduced to describe house-hunting in honeybees16. We used the biological model to design a resilient collective behaviour in a swarm of 100 Kilobot robots. Kilobots are simple robots widely used for collective intelligence studies53. e, f Bifurcation analysis of the mean-field ODE models can be used to identify when a large population can reach an agreement or remains split into two polarised factions. We consider the presence of two groups of zealots, equally split between the two opinions. We display the fixed points of Eqs. (6) and (7) (solid green lines are stable and dashed blue are unstable) in overlay to the stationary probability distribution of the master equation for a population of S = 200 individuals (red colour maps computed using the master equation’s analytical solution of Eqs. (19) and (31)). 
e Using the voter model, the population gets locked into an indecision state as soon as a minimal number of zealots (x-axis) are introduced. f Instead, using the cross-inhibition model, the population demonstrates resilience against relatively high levels of zealotry. For values of asocial behaviour smaller than the supercritical pitchfork bifurcation, the system has two stable fixed points representing population agreement for either option. Instead, after bifurcation (when zealots comprise more than 40% of the entire population), the system with high levels of asocial behaviour has a single attractor, representing indecision: a deadlocked population unable to make any collective decision.

We model asocial behaviour as noise or zealotry, which have both been shown to be mathematically equivalent in the macroscopic-level model14. Instead, at the microscopic level, the individual asocial behaviour is distinct in the case of noise and zealotry. Through the noise, every individual sporadically changes their opinion independently of others’ opinions. Zealots always ignore others’ opinions and remain unmovable (Fig. 1c); a form of antisocial behaviour more extreme than noise that, however, is followed by only a minority of the population22,23. Individuals may have asocial behaviour because they fail to comply with social pressure or retrieve information from independent sources other than peers24. Either situation is common in several biological and artificial systems that rely on noisy communication and operate in complex environments with abundant sources of conflicting information. Identifying mechanisms that provide resiliency and stability to the group is key to understanding how living groups can achieve social homoeostasis25. The cornerstone mechanism of the cross-inhibition model is inhibitory signalling between individuals with opposing opinions. This mechanism is widespread in nature. Evolution has recurrently reached the use of inhibition as the most efficient method to ‘design’ interactions of complex systems. Inhibition can be found in apparently very different systems, such as in cell metabolism26, neuronal activity19, honeybee house-hunting16 and human societies27, however, the underlying models can have striking similarities in their structure and dynamics26,28,29,30. Previous studies have shown that cross-inhibition can break symmetry in the absence of asocial behaviour16,18. Our analysis reveals an additional key feature of the minimal cross-inhibition model: the ability to reach a stable consensus despite asocial behaviour. 
This feature is pivotal in biological systems seeking to achieve coordinated actions, for example, house-hunting honeybees16, neurons firing to discriminate between stimuli19, and individuals agreeing on social norms3. Identifying minimal mechanisms capable of leading to a stable consensus despite asocial behaviour is also relevant in shielding artificial systems from deficient behaviour caused by malfunction or cyber-attacks (Fig. 1d).

Results and discussion

Through mean-field analysis, we show that the ODE system describing the cross-inhibition model predicts a stable group consensus (two stable fixed points) despite a considerable proportion of zealots or, equivalently, a high level of noise. More precisely, when zealots comprise less than 40% of the entire population or the noise is below a given threshold (i.e. prior to the supercritical pitchfork bifurcations in Figs. 1f and 2e, g), the ODE system has two stable fixed points representing population agreement for either option. Instead, the voter model falls into a permanent undecided state as soon as minimal asocial behaviour is present (single green attractor representing indecision in Figs. 1e and 2a, c). However, previous research has shown that, in small populations, the voter model can break the symmetry (i.e. the decision deadlock) induced by fluctuations of a finite-sized system31,32. Additionally, the stable points of the cross-inhibition model are not absorbing states—that is, there is always the possibility that through random fluctuations, the system changes state and moves from the basin of attraction of a stable point to the other. Therefore we conduct semi-analytical and computational analyses of the master equation of the two models in order both to study the stochastic dynamics of finite-sized systems and to quantify the stability of the fixed points.

Finite-sized system analysis

While the mean-field ODE system of the voter model predicts a single stable point with a polarised population split between the two options, Fig. 2a–d shows that a small-sized system with moderate levels of noise is most often in a state of agreement for either option. However, noise-induced bistability leads to highly unstable consensus states (e.g., see the insets of Fig. 2b, d) and, in addition, is vulnerable to an increase in both the system size S and the noise level σ. As shown by the master equation analysis in Fig. 2a–d, for noise values or population sizes greater than a threshold \({\sigma }^{-1}=S\), the voter model transitions from a regime of decision to a regime of indecision31,32. The impact of zealotry on consensus is even more accentuated. Two equally-sized and relatively small groups of inflexible zealots are sufficient to lock a much larger social population into a state of indecision. Whereas in the voter model a small level of asocial behaviour impedes population agreement, the cross-inhibition model breaks the symmetry sustained by large proportions of zealots or by high levels of noise (Figs. 1f and 2e–h). For the cross-inhibition model, we consider two alternative types of noise (dashed and dotted lines in Fig. 1b). Through noise type 1, individuals do not immediately change opinions but go through the undecided state before adopting a new opinion (following the inhibitory mechanism). The behaviour of noise type 1 is analogous to the way in which individuals react to social information received from other individuals, including zealots. Instead, noise type 2 corresponds to the way noise is implemented in the voter model, with individuals directly switching their opinions (e.g., due to self-sourced environmental information24), and allows a more thorough comparison of the two models. For both types of noise (Fig. 2e–h) and for any tested swarm size (see Supplementary Fig. S1), the cross-inhibition model has qualitatively the same dynamics: it breaks the symmetry despite relatively high levels of social noise.

Fig. 2: Comparison between the voter and the cross-inhibition models in symmetry breaking.

The voter model cannot break the symmetry for increasing levels of asocial behaviour or swarm size; instead, the cross-inhibition model can. The red colour maps show the stationary probability distribution (SPD) of a swarm with S = 200 individuals to be in the collective state indicated on the y-axis (see Methods); the SPD is computed via the master equations, except in (g), where it is computed with simulations. The overlying lines show the fixed points of the ODE system (solid green is stable, and dashed blue is unstable). In panels b, d, f, h, each of the main plots shows a number of SPDs that are in correspondence with the colour map on the top. The insets are a trajectory of the change of commitment over time for a representative simulation. Panels a–d are computed through Eq. (19), panels e, f through Eq. (31), and g, h computationally using \(10^{4}\) simulations of Gillespie’s stochastic simulation algorithm. a–d As previously reported31,32,56, for small levels of noise and small swarms, despite different predictions of the mean-field ODE system, the voter model is often in a state of collective agreement. However, the majority is slim, and the dynamics are highly unstable (e.g., see insets of b and d for σ = 0.005). Additionally, the population rapidly goes to indecision as soon as the noise (in a, b, with S = 200) or the system size (in c, d, with σ = 0.005) increases: the two symmetry-breaking peaks, for −1 and +1, transition into a single peak at x − y = 0, which represents indecision. e–h In contrast, the cross-inhibition model is able to break the symmetry and reach an agreement despite relatively high levels of noise, that is, the asocial behaviour of individuals that spontaneously change opinion. We report the results for both types of noise (see x-axis and Fig. 1e–g) and for different swarm sizes (in Supplementary Fig. S1). Increasing noise shifts the agreement to values lower than full consensus. 
Nevertheless, prior to bifurcation, a large majority for one opinion is maintained with very stable dynamics (e.g., see insets of f and h for σ = 0.05 and S = 200, tenfold the noise used in insets of b and d).

Swarm robotics experiments

We tested our theoretical results through a set of experiments with 100 Kilobot robots that locally interact with each other in order to reach an agreement (see Fig. 3a and Supplementary Movie 1, also available at https://youtu.be/mQtLhMqdVWg). The swarm starts from a polarised indecision, with a 50–50 split for either alternative (represented by the blue and red colours). To test the resiliency of the two models against asocial behaviour in the robot swarm, we included 20 zealot robots—equally split between the two options—that only broadcast their opinion but do not listen to others. Figure 3b shows that the remaining 80 robots updating their commitment using the voter model are unable to reach any agreement within one hour. Instead, through the cross-inhibition model, the swarm rapidly selects any of the two alternatives and maintains a stable agreement (Fig. 3c). We can further appreciate the fragility of the voter model in the experiment of Fig. 3d where only four zealots—that is, two robots for each option—have been included in a swarm of 96 cooperative robots. The swarm remains undecided for the most part of the experiment and, in contrast to the cross-inhibition experiment with 20 zealots, the agreement is highly unstable. In Fig. 3e, we show that, in agreement with theory predictions, the stability of the cross-inhibition model can also be undermined by excessively large factions of zealots, in this case 30 zealots. The swarm of 70 cooperative robots is able to frequently attain a large majority for either alternative, however the decision is not stable and, over one hour, the consensus vacillates more than once. Our swarm robotics experiments strengthen the validity of the theoretical models by showing that our mathematical equations can predict the collective behaviour of a swarm of 100 autonomous robots that locally interact with each other. 
Obtaining such correspondence was not obvious and corroborates the possibility of employing our theoretical results to implement resilient swarms of minimalistic robots.

Fig. 3: Swarm robotics experiments.

We implemented the two models on a swarm of 100 Kilobot robots that display their opinion (red or blue) via their colour LED. Green robots have no opinion. a The Kilobots53 move through vibration motors in a flat environment and exchange infra-red messages every 30 s with one another in a range of about 10 cm. Upon receiving a message they change their opinion according to one of the two tested models: the voter and the cross-inhibition model. We included a number of zealot robots that only broadcast but never read other robots' messages, and thus never change opinion. We split the zealots equally between the red and blue options. The displayed image was taken with an overhead camera midway through an experiment. Videos are available as Supplementary Movies 2–5. In panels b–e, we report the change over time of the number of robots with different commitment states; the red and blue curves are bounded by the two horizontal dashed lines determined by the number of inflexible zealot robots. b Using the voter model, the swarm is unable to reach a decision when 20 of the robots are zealots. c Instead, the cross-inhibition model is able to quickly converge to a consensus and maintain it for the entire length of the experiment. d Just 4 zealots (two inflexible robots for each option) are enough to destabilise the agreement of a swarm of 96 robots that cooperate through the voter model. e The cross-inhibition model can also suffer instability when a large proportion of robots are zealots. In this experiment, there are 30 zealots and 70 cooperative robots.

Accuracy and reward in the best-of-n problem

In robotics, inspired by social insects’ behaviour16,28, decentralised decision models have been investigated in the context of the best-of-n problem33,34,35. The opinion of the individuals refers to an option which has a certain quality. While individuals make estimates of the quality that are subject to noise and often inaccurate, the group is able to agree on the option that is better evaluated on average by the population. Therefore, in the best-of-n problem, the goal is not only to reach a consensus but also to agree on the best option. To solve the best-of-n problem, the voter model has been generalised into the weighted voter model36. Individuals vote at a rate proportional to the estimated quality. The individuals directly switch their opinion in response to others’ votes, following the same finite-state machine of the voter model of Fig. 1a. As votes are more frequent for better options, the result is that the population reaches a consensus on the option that individuals have self-estimated to be the best. By relying on the same strategy of quality-proportional voting, the cross-inhibition model has also been presented in a generalised form to tackle the best-of-n problem in artificial distributed systems17. Comparing the two models from the point of view of the best-of-n problem, we find that the two models maximise different metrics (Fig. 4). The weighted voter model demonstrates higher accuracy, that is, the ability to select the best option, while cross-inhibition can yield a higher reward (as discussed after). Figure 5a illustrates the ability of the weighted voter model to always converge to the best option in a relatively short time, despite the system being initialised with a consensus for the inferior option. In real systems, random fluctuation may lead the population to an agreement in favour of the lower-quality alternative. Through the weighted voter model, inaccurate decisions can be reverted. 
However, especially when the options’ qualities are similar, the dynamics of the voter model are highly unstable (e.g., see the high spread of the red shade in Fig. 4a) and can only grant a marginal majority for the best option (see also Fig. 5c). Instead, the cross-inhibition model, by granting more stability to the collective agreement, can lessen accuracy by locking the population in a consensus in favour of the lower-quality alternative (Fig. 5b).

Fig. 4: Comparison between the voter and the cross-inhibition models in best-of-n problems.

Long-term dynamics for decisions between options with asymmetric qualities (quality ratio q = qx/qy ≠ 1) by populations with an infinite size (ODEs) and a finite size of S = 200 individuals (master equation). The main plots show the ODEs' fixed points (solid green lines are stable and dashed blue lines are unstable) overlaid on the results of Gillespie’s stochastic simulation algorithm (red heat-map, \(10^{4}\) runs per condition) for increasing levels of symmetric asocial behaviour (i.e. number of zealots equally split among the two options z = zx = zy). The inset of each panel shows the stationary probability distribution in log-scale computed with the analytical solution of the master equation at equilibrium, as from Eqs. (19) and (31). a The weighted voter model fails to reach a stable consensus when the two options have a similar quality, q = 1.05: the population is either highly unstable (large fluctuations with few zealots) or undecided (polarised population with numerous zealots). b Only for a higher quality ratio, q = 2, does the weighted voter model have a stable and large majority. c, d Through cross-inhibition, the population can make a decision, despite the presence of a relatively large proportion of zealots, in any condition, both for small (c, q = 1.05) and large (d, q = 2) quality differences. However, for small quality differences (c), the high stability of the cross-inhibition model prevents the population from switching its decision and can maintain a large consensus for the option with inferior quality. A stable majority for the inferior quality vanishes when either the quality difference is large (e.g., in d for q = 2) or, counterintuitively, when the proportion of zealots increases.

Fig. 5: Mean switching time and stability analysis for collective decisions between options with asymmetric qualities (quality ratio q = qx/qy ≠ 1).

a The mean switching time (MST) indicates the mean time necessary to reach a stable majority for the best option, when the population is initialised with a consensus for the option with lower quality (500 runs of Gillespie’s stochastic simulation algorithm, 95% confidence interval indicated with colour-shades that are often smaller than the line-width). The voter model can quickly overturn the consensus for the inferior option. Instead, using the cross-inhibition model, the population is more rarely able to change its decision in favour of the best option in a reasonable time. The MST has low values when a large number of zealots are present, or when the quality difference is high. In the other cases, the cross-inhibition model locks a large majority into a stable consensus for either option, which can also be the inferior one. b The fold bifurcation point of the cross-inhibition model shows that by increasing asocial behaviour, the system transitions from a phase of bistability to a phase with a single stable point. This phase transition threshold decreases with increased quality ratio q. The inset shows that increasing z removes bistability but also makes the majority smaller. c Nevertheless, cross-inhibition always has a larger majority than the weighted voter model, except for extremely high proportions of zealots, here illustrated as the stable fixed point of populations x + zx from the ODE systems of Eqs. (6) and (7).

Finding that the use of inhibition reduces collective accuracy is in contrast with extensive literature that shows inhibition as a pervasive mechanism in most natural systems that seek collective consensus16,19,26,28,37. Such discord can be reconciled by looking at a different metric. Rather than maximising accuracy, the cross-inhibition model looks better suited to yielding a high reward rate38. When measuring accuracy, only the selection of the highest quality option is labelled as the correct (accurate) decision, while selecting any other option is incorrect. Instead, in value-based decisions, the magnitude of the reward received by the decision-makers is used to measure the value of the chosen alternative. On the one hand, considering accuracy can simplify the results’ analysis33,39. On the other hand, reasoning in terms of value-based decisions—in which the group benefits from the reward of the chosen option regardless of whether it is the absolute best in the environment—is more appropriate in most applications, both in engineered and biological systems (e.g., nutritional value during foraging, site quality during house-hunting, or aggregation location for the robot swarm)38,40,41,42. As recent work has shown, the optimal strategy for value-based decisions can differ from normative accuracy-based decision policies43, especially in decisions between options with equal qualities41. Our analysis shows that, in line with the optimal policy43, the cross-inhibition model converges to the highest quality option when the quality difference between options is large (Fig. 4d), and it trades accuracy for stability when the options have similar qualities (Figs. 4c and 5a).

The benefits of inhibitory signals

This study shows the inability of the voter model—both classical and weighted—to reach group coordination when symmetric noise or asocial behaviour is present, even in minimal quantities. In both artificial and natural systems, it is well justified to assume individuals may not always follow social pressure and sometimes act independently due to, for example, information acquired through individual exploration24 or sensorial failures44. The success of the voter model is due to its mathematical simplicity and tractability, however, for scientific progress, the fields of opinion dynamics and sociophysics require their models to include more aspects from real systems45. The cross-inhibition model is an alternative to the voter model that includes a widespread mechanism in nature: inhibitory signalling between individuals with different opinions. Such a simple change in individual behaviour revealed a dramatic change in the collective dynamics of the swarm, enabling system stability and symmetry-breaking. Mathematically, the cross-inhibition model has similarities with the binary naming game3 which is also capable of breaking decision deadlocks in the presence of symmetric inflexible populations46. However, the rules of the naming game can be extended to problems with a larger number of options in a way that is fundamentally different from the cross-inhibition model, and this makes its analysis difficult due to the combinatorial explosion of the number of states47. In this article, we study the minimal formulation of a model that includes cross-inhibition and benefits from mathematical simplicity, as also shown in previous analytical studies that investigated the scalability of the cross-inhibition model to higher dimension problems18. Our analysis shows that, with cross-inhibition, the collective performance is enhanced, rather than hampered, by noise (or zealots). 
Figure 4 shows that increasing asocial behaviour to moderate values removes the possibility of inaccurate decisions while still granting a consensus decision. This finding is in agreement with previous research that showed that the emergence of group coordination, in humans, animals, or robots, is facilitated by noise9,48,49. This study goes in the direction of building opinion dynamics models that gradually include more biologically plausible mechanisms while preserving mathematical simplicity.

Methods

Chemical reactions

Both models, illustrated in the motifs of Fig. 1a, b, can be written as a set of chemical reactions. The chemical reactions define the transition rates between the states in which an individual can be. In our model, we define the following five states: commitment for options X and Y, no opinion U, and zealots in favour of either option, ZX and ZY.

The voter model reads as

$$\begin{array}{lll} X + Y \mathop{\rightarrow}\limits^{ q_x} X + X \quad& X + Y \mathop{\rightarrow}\limits^{ q_y } Y + Y \quad& {{{\mbox{(direct switching)}}}}\\ Y \mathop{\rightarrow}\limits^{ \sigma } X \hfill & X \mathop{\rightarrow}\limits^{ \sigma } Y \hfill & {{{\mbox{(noise)}}}} \hfill \\ Y + Z_X \mathop{\rightarrow}\limits^{ q_x } X + Z_X & X + Z_Y \mathop{\rightarrow}\limits^{ q_y } Y + Z_Y & {{{\mbox{(zealots)}}}}.\hfill \end{array}$$
(1)

The rates of transitions resulting from interactions between individuals with different commitments (labelled in Eq. (1) as direct switching and zealots) are proportional to the options’ qualities qx and qy, for recruitment to options X and Y, respectively. When noise is absent (σ = 0), there are no zealots (ZX = ZY = 0), and the qualities are equal (q = qx/qy = 1), the model reduces to the classical voter model7,8. The system with q ≠ 1 is a generalisation of the voter model in which individuals modulate their voting probability as a function of the option’s quality; it has been named the “weighted voter model”36 and has been introduced to make collective decisions that solve the best-of-n problem. Including noise σ > 0 in the voter model has been studied as the “noisy voter model”14. Noise can represent any form of independent (asocial) behaviour that leads the individual to change their opinion, for example, self-sourced information. Therefore, the results from our model analysis can also be relevant to study how the frequency of using personal and social information, σ and (qx + qy), impacts the system dynamics. Zealots never change their opinion but influence the opinion of others13,50. The impact of these individuals on the voter model has been shown to be equivalent to the impact of independent behaviour (i.e. noise), see ref. 14 and our discussion below.

The cross-inhibition model reads as

$$\begin{array}{lll} X + Y \mathop{\rightarrow}\limits^{ q_x } X + U \hfill & X + Y \mathop{\rightarrow}\limits^{ q_y } U + Y \hfill & {{{\mbox{(cross-inhibition)}}}}\\ U + X \mathop{\rightarrow}\limits^{ q_x } X + X \hfill & U + Y \mathop{\rightarrow}\limits^{ q_y } Y + Y \hfill & {{{\mbox{(recruitment)}}}} \hfill \\ Y \mathop{\rightarrow}\limits^{ \sigma } U \quad U \mathop{\rightarrow}\limits^{ \sigma } Y \hfill & X \mathop{\rightarrow}\limits^{ \sigma } U \quad U \mathop{\rightarrow}\limits^{ \sigma } X \hfill & {{{\mbox{(noise type 1)}}}} \hfill \\ Y \mathop{\rightarrow}\limits^{ \sigma } X \hfill & X \mathop{\rightarrow}\limits^{ \sigma } Y \hfill & {{{\mbox{(noise type 2)}}}} \hfill \\ Y + Z_X \mathop{\rightarrow}\limits^{ q_x } U + Z_X & X + Z_Y \mathop{\rightarrow}\limits^{ q_y } U + Z_Y & {{{\mbox{(zealots)}}}} \hfill \\ U + Z_X \mathop{\rightarrow}\limits^{ q_x } X + Z_X & U + Z_Y \mathop{\rightarrow}\limits^{ q_y } Y + Z_Y & {{{\mbox{(zealots)}}}}.\hfill \end{array}$$
(2)

The cross-inhibition model17,18 is a variation of the weighted voter model that includes a null state (U) of indecision as a necessary transition step between any other state51. We consider two alternative types of noise. Noise type 1 follows the transition schema of the model by letting the individual go through the state U before adopting a new opinion (see dashed lines in Fig. 1b). To conduct a more complete comparison of the two models, we also consider noise type 2, which corresponds to the way noise is implemented in the voter model (see dotted lines in Fig. 1b).
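As an illustration, the cross-inhibition and zealot reactions of Eq. (2) can be simulated with Gillespie’s stochastic simulation algorithm. The sketch below is a minimal version written for this purpose, not the code used to produce the figures: the function name and default parameters are hypothetical, noise is switched off (σ = 0), and we assume mass-action propensities scaled by 1/S.

```python
import random


def gillespie_cross_inhibition(S=200, zealots_per_option=10, qx=1.0, qy=1.0,
                               t_max=50.0, seed=0):
    """Simulate Eq. (2) with zealots and no noise; return final (SX, SY, SU)."""
    rng = random.Random(seed)
    zx = zy = zealots_per_option
    flexible = S - zx - zy
    sx = flexible // 2          # start from a 50-50 split, nobody undecided
    sy = flexible - sx
    su = 0
    t = 0.0
    while t < t_max:
        # Propensities (assumption: mass-action kinetics scaled by 1/S).
        rates = [
            qx * sy * (sx + zx) / S,  # Y inhibited by X or Z_X:  Y -> U
            qy * sx * (sy + zy) / S,  # X inhibited by Y or Z_Y:  X -> U
            qx * su * (sx + zx) / S,  # U recruited by X or Z_X:  U -> X
            qy * su * (sy + zy) / S,  # U recruited by Y or Z_Y:  U -> Y
        ]
        total = sum(rates)
        if total == 0.0:
            break                      # no reaction can fire any more
        t += rng.expovariate(total)    # exponential waiting time
        reaction = rng.choices(range(4), weights=rates)[0]
        if reaction == 0:
            sy -= 1
            su += 1
        elif reaction == 1:
            sx -= 1
            su += 1
        elif reaction == 2:
            su -= 1
            sx += 1
        else:
            su -= 1
            sy += 1
    return sx, sy, su


sx, sy, su = gillespie_cross_inhibition()
print(f"SX={sx}, SY={sy}, SU={su}")
```

Every reaction moves exactly one flexible individual between two states, so SX + SY + SU stays equal to S − 2SZ throughout the run.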

Ordinary differential equations

The change in the size of the populations committed to X and Y can be studied through a system of ODEs. We indicate with \({S}_{X},{S}_{Y},{S}_{U},{S}_{{Z}_{X}}\), and \({S}_{{Z}_{Y}}\) the sizes of the populations in the state indicated in the subscript. As we only consider a symmetric number of zealots, we have \({S}_{{Z}_{X}}={S}_{{Z}_{Y}}={S}_{Z}\). For a total population of S individuals (i.e. S = SX + SY + SU + 2SZ), we represent the proportions of individuals in each opinion state with x = SX/S, y = SY/S, u = SU/S, and z = SZ/S (note that the total proportion of zealots in the population is 2z). We consider only one type of asocial behaviour at a time. Therefore, we first introduce a generic ODE system with the term α indicating the asocial behaviour, which can then be substituted with either noise or zealotry. Assuming a large population, for S → ∞, the time derivative of x for the voter model is

$$\frac{\mathrm{d}x}{\mathrm{d}t}=xy({q}_{x}-{q}_{y})+\alpha .$$
(3)

When we consider asocial behaviour in the form of noise, α = σ(y − x); instead, when we consider zealot behaviour, we have α = z(qxy − qyx). The two alternative systems are

$$\,{{\mbox{weighted voter model with symmetric noise}}}\,\,\sigma \quad \frac{{{{{{{{\rm{d}}}}}}}}x}{{{{{{{{\rm{d}}}}}}}}t} =xy({q}_{x}-{q}_{y})+\sigma (y-x);$$
(4)
$$\,{{\mbox{weighted voter model with symmetric zealots}}}\,z\quad \frac{{{{{{{{\rm{d}}}}}}}}x}{{{{{{{{\rm{d}}}}}}}}t}={q}_{x}y(x+z)-{q}_{y}x(y+z).$$
(5)

In agreement with previous research14, independent behaviour (noise) and zealots can be mathematically equivalent, namely when σ(y − x) = z(qxy − qyx). The equivalence holds either when the options have equal quality, q = qx/qy = 1, or when the noise rates differ between the two populations, i.e. σx ≠ σy. In our study, we keep the two models distinct in two ways. First, while zealots are symmetrically distributed between the two alternatives, they recruit peers at rates proportional to their option’s quality; noise, instead, is an asocial component that contributes symmetrically to both options. Second, zealots are included in the total population S; therefore, increasing the number of zealots indirectly reduces the number of flexible individuals (those that can change opinion), as SX + SY = S − 2SZ, whereas increasing noise has no effect on the number of flexible individuals, SX + SY = S (e.g., noise σ can be the frequency of self-sourcing information24). We find it more appropriate to consider zealots as part of the total population S rather than as an external source of influence. Considering different models helps to show that the results of our analysis are consistent across different types of symmetric asocial behaviour.

The stability analysis of Eqs. (4) and (5) can be simplified by replacing the proportion y with y = 1 − x − 2z and, in the latter, by dividing both sides by qy (note that the latter operation results in scaling the time by the quantity qy, that is τ = qyt). We define the quality ratio q = qx/qy. In this way, Eq. (5) can be rewritten as

$$\frac{{{{{{{{\rm{d}}}}}}}}x}{{{{{{{{\rm{d}}}}}}}}\tau }=q(1-x-2z)(x+z)-x(1-x-z).$$
(6)

The ODE system for the cross-inhibition model with generic asocial behaviour reads as

$$\left\{\begin{array}{l}\frac{{{{{{{{\rm{d}}}}}}}}x}{{{{{{{{\rm{d}}}}}}}}\tau }=x(qu-y)+{\alpha }_{x}\quad \\ \frac{{{{{{{{\rm{d}}}}}}}}y}{{{{{{{{\rm{d}}}}}}}}\tau }=y(u-qx)+{\alpha }_{y},\quad \end{array}\right.$$
(7)

where u = 1 − x − y − 2z. When we instantiate asocial behaviour in the form of zealot behaviour, αx = z(qu − x) and αy = z(u − qy), and we obtain

$$\left\{\begin{array}{l}\frac{{{{{{{{\rm{d}}}}}}}}x}{{{{{{{{\rm{d}}}}}}}}\tau }=qu(x+z)-x(y+z)\quad \\ \frac{{{{{{{{\rm{d}}}}}}}}y}{{{{{{{{\rm{d}}}}}}}}\tau }=u(y+z)-qy(x+z).\quad \end{array}\right.$$
(8)

The cross-inhibition models with the two types of noise are presented in Supplementary Note 1.
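The dynamics of Eq. (8) can be explored numerically. The following is a minimal Python sketch that integrates Eq. (8) with a forward-Euler scheme; the function names, the step size, the integration length, and the parameter values are illustrative assumptions and are not taken from the original analysis.

```python
def cross_inhibition_rhs(x, y, q, z):
    """Right-hand side of Eq. (8); returns (dx/dtau, dy/dtau)."""
    u = 1.0 - x - y - 2.0 * z  # proportion of undecided individuals
    dx = q * u * (x + z) - x * (y + z)
    dy = u * (y + z) - q * y * (x + z)
    return dx, dy


def integrate(x0, y0, q, z, dt=0.01, steps=20000):
    """Forward-Euler integration of Eq. (8) (dt and steps are illustrative)."""
    x, y = x0, y0
    for _ in range(steps):
        dx, dy = cross_inhibition_rhs(x, y, q, z)
        x, y = x + dt * dx, y + dt * dy
    return x, y


# Hypothetical example: equal initial split, quality ratio q = 1.5, 2z = 10% zealots.
x_end, y_end = integrate(0.45, 0.45, q=1.5, z=0.05)
print(f"x = {x_end:.4f}, y = {y_end:.4f}, u = {1 - x_end - y_end - 0.1:.4f}")
```

Starting from an equal split, the trajectory in this example settles on a state with a majority for the higher-quality option X; other initial conditions and parameters can be tested by changing the arguments of `integrate`.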

Stability analysis

The voter model in both conditions (with noise or with zealots) has two fixed points, of which only one is stable and lies in the admissible interval x ∈ [0, 1]. In the case of the voter model with noise (i.e. σ > 0 and z = 0), the stable fixed point is

$${x}^{* }=\frac{1}{2}+\frac{\sqrt{{({q}_{x}-{q}_{y})}^{2}+4{\sigma }^{2}}-2\sigma }{2({q}_{x}-{q}_{y})}.$$
(9)

In the case of the voter model with zealots (i.e. σ = 0 and z > 0), the stable fixed point is

$${x}^{* }=\frac{1}{2}+\frac{\sqrt{{(q-1)}^{2}+z[{(q+1)}^{2}z-2{(q-1)}^{2}]}-z(3q-1)}{2(q-1)}.$$
(10)

Note that Eqs. (9) and (10) are only defined for q ≠ 1 (that is, qx ≠ qy); instead, when q = 1, in both conditions, the ODE system remains in decision deadlock with x = y (x = y = 0.5 with noise and x = y = 0.5 − z with zealots, since x + y = 1 − 2z). The analysis of the fixed point for q ≠ 1 reveals that, for relatively low values of noise or zealotry, the system can reach a large majority only when the quality difference is large, i.e. q ≫ 1 or symmetrically q ≈ 0. Figures 2a–d, 4a, b, and 5c show the predicted agreement levels, according to Eqs. (9) and (10), for various values of the quality ratio q, the noise level σ, and the proportion of zealots z.
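As a quick numerical check, Eqs. (9) and (10) can be substituted back into the right-hand sides of Eqs. (4) and (6), which should then vanish. The following minimal Python sketch does this; the function names and the parameter values are illustrative assumptions.

```python
import math


def voter_noise_rhs(x, qx, qy, sigma):
    """Right-hand side of Eq. (4) with y = 1 - x (no zealots)."""
    y = 1.0 - x
    return x * y * (qx - qy) + sigma * (y - x)


def voter_zealot_rhs(x, q, z):
    """Right-hand side of Eq. (6), with time rescaled by q_y."""
    return q * (1 - x - 2 * z) * (x + z) - x * (1 - x - z)


def fixed_point_noise(qx, qy, sigma):
    """Stable fixed point of Eq. (9); requires qx != qy."""
    d = qx - qy
    return 0.5 + (math.sqrt(d * d + 4 * sigma * sigma) - 2 * sigma) / (2 * d)


def fixed_point_zealots(q, z):
    """Stable fixed point of Eq. (10); requires q != 1."""
    disc = (q - 1) ** 2 + z * ((q + 1) ** 2 * z - 2 * (q - 1) ** 2)
    return 0.5 + (math.sqrt(disc) - z * (3 * q - 1)) / (2 * (q - 1))


# Hypothetical parameter values chosen only for illustration.
x_noise = fixed_point_noise(qx=1.5, qy=1.0, sigma=0.1)
x_zealot = fixed_point_zealots(q=1.5, z=0.05)
print(x_noise, voter_noise_rhs(x_noise, 1.5, 1.0, 0.1))
print(x_zealot, voter_zealot_rhs(x_zealot, 1.5, 0.05))
```

In both cases the printed residual is zero up to floating-point error, and the fixed point moves towards 0.5 as σ or z grows.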

The cross-inhibition model has four fixed points that change their stability as a function of the system’s parameters. For the symmetric quality case, q = 1, with zealots z ≥ 0, the fixed points (pre-bifurcation) are

$${x}^{* }=\frac{1}{2}\left(1-3z\pm \sqrt{5{z}^{2}-6z+1}\right);\qquad {y}^{* }=\frac{1}{2}\left(1-3z\mp \sqrt{5{z}^{2}-6z+1}\right).$$
(11)
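Equation (11) can be verified by inserting it into Eq. (8) with q = 1. The minimal Python sketch below (function names and parameter values are illustrative assumptions) also shows that the square root in Eq. (11) is real only while 5z² − 6z + 1 = (5z − 1)(z − 1) ≥ 0, that is, for z ≤ 0.2 in the range of interest.

```python
import math


def cross_inhibition_rhs(x, y, q, z):
    """Right-hand side of Eq. (8); returns (dx/dtau, dy/dtau)."""
    u = 1.0 - x - y - 2.0 * z
    return q * u * (x + z) - x * (y + z), u * (y + z) - q * y * (x + z)


def symmetric_fixed_points(z):
    """Asymmetric pair (x*, y*) of Eq. (11) for q = 1, or None if it does not exist."""
    disc = 5 * z * z - 6 * z + 1
    if disc < 0:
        return None  # past the bifurcation: no consensus fixed points
    root = math.sqrt(disc)
    return 0.5 * (1 - 3 * z + root), 0.5 * (1 - 3 * z - root)


# Hypothetical values of z, chosen to sit on both sides of the bifurcation.
for z in (0.05, 0.19, 0.21):
    point = symmetric_fixed_points(z)
    if point is None:
        print(f"z = {z}: no fixed point from Eq. (11)")
    else:
        print(f"z = {z}: (x*, y*) = {point}, residual = {cross_inhibition_rhs(*point, 1.0, z)}")
```

The second solution of Eq. (11) is obtained by swapping x* and y*, as expected from the symmetry of the q = 1 case.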

For the asymmetric case, q ≠ 1, the results are more complex. While it is possible to derive the fixed points’ mathematical equations and their stability conditions through standard software (e.g., see the Mathematica notebook provided in the Code Availability Statement), the complexity of the equations makes it impossible to study them from their mathematical form. Instead, we can plot the dynamics for representative cases and so interpret the general dynamics of the system. In particular, in Fig. 5c, we display the fixed point of population x for both the cross-inhibition and the voter models so that we can appreciate the differences between the two. It is noteworthy that, except for extremely high values of z, the cross-inhibition model always yields a larger majority in favour of the best option than the voter model. However, as shown in Figs. 4c, d, and 5b, the cross-inhibition model can have more than one stable state, which can lead to a consensus for the inferior alternative. Counterintuitively, the lower branch, in favour of the inferior alternative, is only present for low values of asocial behaviour and vanishes as the number of zealots increases. Therefore, it appears that zealotry can improve the accuracy of the cross-inhibition model by removing the possibility of selecting the inferior option. The minimum proportion of zealots z required to make the lower branch vanish decreases as the quality difference increases. This effect can be measured by plotting the point of the fold bifurcation as a function of the quality ratio q; Fig. 5b shows that the parameter space where the lower branch is present reaches its maximum, z = 0.2, for q = 1 and rapidly decreases as q grows larger than 1.

Master equations

The dynamics of finite-sized systems can be studied through master equations. Here, for both models, we present the master equations and find their analytical solution at equilibrium (that is, the stationary probability distribution, SPD); finally, we illustrate the numerical analysis employed to study the transient temporal dynamics.

Master equation of the weighted voter model

We first define the rates at which transitions occur: \({T}_{x = k}^{+x}\) is the rate at which the population committed to X, with size SX = k, increases by one individual, and \({T}_{x = k}^{-x}\) is the rate at which it decreases by one individual. We recall that the total population has size S and is composed of subpopulations in different states: either committed to options X or Y, with sizes SX and SY, or zealots, with sizes \({S}_{{Z}_{X}}\) and \({S}_{{Z}_{Y}}\), which in our study we consider symmetric, \({S}_{{Z}_{X}}={S}_{{Z}_{Y}}={S}_{Z}\). Therefore, the system is fully specified by tracking only the changes in the population committed to X, because SY = S − SX − 2SZ. In the voter model, the rate \({T}_{x = k}^{+x}\) describes an individual in state Y directly switching to X after an interaction with an individual committed to X (either a susceptible individual in state X or a zealot in state ZX), which happens with a frequency proportional to option X’s quality qx. Alternatively, when there are no zealots (SZ = 0), the transition can also be caused by asocial noise σ > 0. Therefore,

$${{\mbox{with noise:}}}\,{T}_{x = k}^{+x} =(S-k)\left({q}_{x}\frac{k}{S-1}+\sigma \right);\\ {{\mbox{with zealots:}}}\,{T}_{x = k}^{+x} =(S-k-2{S}_{Z})\,{q}_{x}\left(\frac{k+{S}_{Z}}{S-1}\right).$$
(12)

Symmetrically, the population committed to X decreases by one when one of its individuals (state X) directly switches to Y after an interaction with a susceptible individual in state Y or a zealot in state ZY (with a frequency proportional to option Y’s quality qy), or alternatively through asocial noise σ > 0. The reduction rate \({T}_{x = k}^{-x}\) is

$${{\mbox{with noise:}}}\,{T}_{x = k}^{-x} =k\left({q}_{y}\frac{S-k}{S-1}+\sigma \right);\\ {{\mbox{with zealots:}}}\,{T}_{x = k}^{-x}=k\,{q}_{y}\left(\frac{S-k-2{S}_{Z}+{S}_{Z}}{S-1}\right).$$
(13)

The system state is characterised by the probability Px=k(t) that the number of individuals in state X at time t is SX = k. Through the rates of Eqs. (12) and (13), we can define the master equation of the weighted voter model, which describes how the probability Px=k(t) changes over time:

$$\frac{{{{{{{{\rm{d}}}}}}}}{P}_{x = k}(t)}{{{{{{{{\rm{d}}}}}}}}t}={T}_{x = k-1}^{+x}{P}_{x = k-1}^{}(t)+{T}_{x = k+1}^{-x}{P}_{x = k+1}^{}(t)-({T}_{x = k}^{+x}+{T}_{x = k}^{-x}){P}_{x = k}^{}(t).$$
(14)

The first two terms on the right-hand side of Eq. (14) account for processes in which the number of individuals in state X after the event equals SX = k, while the last term accounts for the complementary loss processes in which SX increases or decreases by one unit (i.e. (SX = k) → (SX = k + 1) or (SX = k) → (SX = k − 1), respectively).
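The gain and loss structure of Eq. (14) can be made concrete with a short numerical sketch. The Python code below advances the probability vector with a forward-Euler step, using the rates of Eqs. (12) and (13); the integration scheme, the step size, the function names, and the parameter values are illustrative assumptions and not necessarily the numerical method used for the results of this study.

```python
def voter_rates(S, qx, qy, sigma=0.0, SZ=0):
    """Rates of Eqs. (12)-(13). Use sigma > 0 with SZ = 0, or SZ > 0 with sigma = 0."""
    n_max = S - 2 * SZ  # largest possible value of S_X

    def t_plus(k):
        return (n_max - k) * (qx * (k + SZ) / (S - 1) + sigma)

    def t_minus(k):
        return k * (qy * (n_max - k + SZ) / (S - 1) + sigma)

    return t_plus, t_minus, n_max


def euler_step(p, t_plus, t_minus, dt):
    """One forward-Euler step of Eq. (14) for the vector p[k] = P_{x=k}(t)."""
    n_max = len(p) - 1
    new_p = []
    for k in range(n_max + 1):
        gain = 0.0
        if k > 0:
            gain += t_plus(k - 1) * p[k - 1]   # arrival from S_X = k - 1
        if k < n_max:
            gain += t_minus(k + 1) * p[k + 1]  # arrival from S_X = k + 1
        loss = (t_plus(k) + t_minus(k)) * p[k]  # departures from S_X = k
        new_p.append(p[k] + dt * (gain - loss))
    return new_p


def evolve(S, qx, qy, sigma=0.0, SZ=0, dt=0.005, steps=2000):
    """Evolve Eq. (14) from the state S_X = 0 (illustrative initial condition)."""
    t_plus, t_minus, n_max = voter_rates(S, qx, qy, sigma, SZ)
    p = [0.0] * (n_max + 1)
    p[0] = 1.0
    for _ in range(steps):
        p = euler_step(p, t_plus, t_minus, dt)
    return p


p = evolve(S=20, qx=1.0, qy=1.0, sigma=0.1)
print(sum(p), max(range(len(p)), key=p.__getitem__))
```

Because every gain term of one state is a loss term of a neighbouring state, the total probability is conserved at each step.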

Stationary probability distribution of the weighted voter model

In stationary conditions, the master equation reaches an equilibrium, that is \(\frac{{{{{{{{\rm{d}}}}}}}}{P}_{x = k}(t)}{{{{{{{{\rm{d}}}}}}}}t}=0\), and therefore the probability distribution is stable over time. The stationary master equation obtained from Eq. (14) is

$${T}_{x = k-1}^{+x}{P}_{x = k-1}^{* }+{T}_{x = k+1}^{-x}{P}_{x = k+1}^{* }-\left({T}_{x = k}^{+x}+{T}_{x = k}^{-x}\right){P}_{x = k}^{* }=0,$$
(15)

where \({P}_{x = k}^{* }={{{\mbox{lim}}}}_{t\to \infty }{P}_{x = k}(t)\) indicates the stationary solution that is independent of time t. As the process can be considered time reversible at equilibrium conditions, the SPD can be derived from the detailed balance principle, which yields

$${T}_{x = k-1}^{+x}{P}_{x = k-1}^{* }={T}_{x = k}^{-x}{P}_{x = k}^{* }.$$
(16)

By iterating Eq. (16), we obtain the SPD:

$${P}_{x = k}^{* }={P}_{x = 0}^{* }\mathop{\prod }\limits_{j=0}^{k-1}\frac{{T}_{x = j}^{+x}}{{T}_{x = j+1}^{-x}},$$
(17)

where the normalisation condition \(\mathop{\sum }\nolimits_{k = 0}^{S-2{S}_{Z}}{P}_{x = k}^{* }=1\) allows us to compute \({P}_{x = 0}^{* }\) as

$${P}_{x = 0}^{* }=\frac{1}{1+\mathop{\sum }\nolimits_{k = 1}^{S-2{S}_{Z}}\mathop{\prod }\nolimits_{j = 0}^{k-1}\frac{{T}_{x = j}^{+x}}{{T}_{x = j+1}^{-x}}}.$$
(18)

Note that the summation in the denominator runs up to S − 2SZ, which is the maximum size that the population SX can reach. Because SY = S − SX − 2SZ, the probability \({P}_{y = k}^{* }\) for the population committed to option Y can be computed as \({P}_{y = k}^{* }={P}_{x = S-k-2{S}_{Z}}^{* }\). The expanded forms of Eq. (17) for both asocial mechanisms (noise and zealotry), with the rates from Eqs. (12) and (13), are presented in Supplementary Note 2.
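Equations (17) and (18) translate directly into a short computation: the unnormalised weights are built by iterating the ratio of Eq. (16), and their sum gives the normalisation constant. The following Python sketch is a minimal illustration under assumed parameter values; the function names are hypothetical.

```python
def voter_rates(S, qx, qy, sigma=0.0, SZ=0):
    """Rates of Eqs. (12)-(13). Use sigma > 0 with SZ = 0, or SZ > 0 with sigma = 0."""
    n_max = S - 2 * SZ  # largest possible value of S_X

    def t_plus(k):
        return (n_max - k) * (qx * (k + SZ) / (S - 1) + sigma)

    def t_minus(k):
        return k * (qy * (n_max - k + SZ) / (S - 1) + sigma)

    return t_plus, t_minus, n_max


def voter_spd(S, qx, qy, sigma=0.0, SZ=0):
    """Stationary distribution P*_{x=k}, k = 0..S-2SZ, from Eqs. (17)-(18)."""
    t_plus, t_minus, n_max = voter_rates(S, qx, qy, sigma, SZ)
    weights = [1.0]  # weight of k = 0, i.e. P*_{x=k} / P*_{x=0}
    for k in range(1, n_max + 1):
        weights.append(weights[-1] * t_plus(k - 1) / t_minus(k))
    total = sum(weights)  # equals 1 / P*_{x=0}, see Eq. (18)
    return [w / total for w in weights]


# Hypothetical example: S = 50 individuals, of which 5 zealots per option.
spd = voter_spd(S=50, qx=1.5, qy=1.0, SZ=5)
mean_sx = sum(k * p for k, p in enumerate(spd))
print(f"mean S_X = {mean_sx:.2f} out of {len(spd) - 1} flexible individuals")
```

Building the weights iteratively avoids recomputing the product of Eq. (17) for every k.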

The SPD of the difference of the populations committed to the two options, illustrated in Figs. 1e, 2a–d, and 4a, b, is computed as

$${P}_{x-y = k}^{* }={P}_{x = \frac{S-2{S}_{Z}+k}{2}}^{* }.$$
(19)

Master equation of the cross-inhibition model

For the cross-inhibition model, the increment and reduction rates necessary to define the master equation are \({T}_{x = a,y = b}^{+x},{T}_{x = a,y = b}^{-x},{T}_{x = a,y = b}^{+y}\), and \({T}_{x = a,y = b}^{-y}\). These are the rates of increments (superscript +x or +y) or reductions (superscript −x or −y) by one individual in the population committed to X (+x or −x) or to Y (+y or −y), given that the committed populations have sizes SX = a and SY = b. Here, we only describe rates for zealots and noise type 1 (see Eq. (2)) because the SPD of the master equation with noise type 2 cannot be computed analytically using the detailed balance principle, owing to transitions happening both from state U to states X and Y and directly between states X and Y. Therefore, we only conducted numerical analysis for the cross-inhibition model with noise type 2 (see the section on the stochastic simulation algorithm). For the other two types of asocial behaviour, increments by one individual in the committed populations, SX or SY, occur when individuals of that population, or zealots for that option, recruit an uncommitted individual, or otherwise through asocial noise. Rates \({T}_{x = a,y = b}^{+x}\) and \({T}_{x = a,y = b}^{+y}\) are

$$\begin{array}{lll}{{\mbox{with noise type 1:}}} & {T}_{x = a,y = b}^{+x}=(S-a-b)\left({q}_{x}\frac{a}{S-1}+\sigma \right); & {T}_{x = a,y = b}^{+y} =(S-a-b)\left({q}_{y}\frac{b}{S-1}+\sigma \right);\\ {{\mbox{with zealots:}}} &{T}_{x = a,y = b}^{+x} ={q}_{x}(S-a-b-2{S}_{Z})\left(\frac{a+{S}_{Z}}{S-1}\right); & {T}_{x = a,y = b}^{+y}={q}_{y}(S-a-b-2{S}_{Z})\left(\frac{b+{S}_{Z}}{S-1}\right).\end{array}$$
(20)

Instead, reduction by one individual from populations committed to X or Y occurs when individuals committed to different options interact with each other or spontaneously through asocial noise. Rates \({T}_{x = a,y = b}^{-x}\) and \({T}_{x = a,y = b}^{-y}\) are

$$\begin{array}{lll}{{\mbox{with noise type 1:}}} & {T}_{x = a,y = b}^{-x}=a\left({q}_{y}\frac{b}{S-1}+\sigma \right); & {T}_{x = a,y = b}^{-y}=b\left({q}_{x}\frac{a}{S-1}+\sigma \right)\\ {{\mbox{with zealots:}}} & {T}_{x = a,y = b}^{-x}=a\,{q}_{y}\left(\frac{b+{S}_{Z}}{S-1}\right); & {T}_{x = a,y = b}^{-y}=b\,{q}_{x}\left(\frac{a+{S}_{Z}}{S-1}\right).\end{array}$$
(21)

The master equation for the cross-inhibition model describes the change over time of the probability Px=a,y=b(t) that the numbers of individuals committed to options X and Y at time t are SX = a and SY = b, respectively. Because each individual (except for the zealots, which do not change state) can be in three possible states (committed to X, committed to Y, or uncommitted, i.e. state U), we have SU = S − SX − SY − 2SZ, and the sizes SX and SY are sufficient to fully specify the system. The master equation is

$$\frac{{{{{{{{\rm{d}}}}}}}}{P}_{x = a,y = b}(t)}{{{{{{{{\rm{d}}}}}}}}t} = \, {T}_{x = a-1,y = b}^{+x}{P}_{x = a-1,y = b}(t)+{T}_{x = a+1,y = b}^{-x}{P}_{x = a+1,y = b}(t)\\ +{T}_{x = a,y = b-1}^{+y}{P}_{x = a,y = b-1}(t)+{T}_{x = a,y = b+1}^{-y}{P}_{x = a,y = b+1}(t)\\ -\left({T}_{x = a,y = b}^{+x}+{T}_{x = a,y = b}^{-x}+{T}_{x = a,y = b}^{+y}+{T}_{x = a,y = b}^{-y}\right){P}_{x = a,y = b}(t).$$
(22)

Stationary probability distribution of the cross-inhibition model

We compute the stationary solution \({P}_{x = a,y = b}^{* }={{{\mbox{lim}}}}_{t\to \infty }{P}_{x = a,y = b}(t)\) of the master equation when the dynamics have converged (\(\frac{{{{{{{{\rm{d}}}}}}}}{P}_{x = a,y = b}(t)}{{{{{{{{\rm{d}}}}}}}}t}=0\)) and the probability distribution is independent of time. Because there are no direct transitions from state X to state Y and vice versa, and the agents always pass through the uncommitted state U, each transition consists of a change of one individual (increase or reduction) either in X or in Y, and it can never change both populations at once. Therefore, we can employ the detailed balance principle and obtain

$${T}_{x = a-1,y = b}^{+x}{P}_{x = a-1,y = b}^{* }={T}_{x = a,y = b}^{-x}{P}_{x = a,y = b}^{* },\quad \,{{\mbox{and}}}\,$$
(23)
$${T}_{x = a,y = b-1}^{+y}{P}_{x = a,y = b-1}^{* }={T}_{x = a,y = b}^{-y}{P}_{x = a,y = b}^{* }.$$
(24)

By iterating Eqs. (23) and (24), we can compute, respectively,

$${P}_{x = a,y = 0}^{* }={P}_{x = 0,y = 0}^{* }\mathop{\prod }\limits_{j=0}^{a-1}\frac{{T}_{x = j,y = 0}^{+x}}{{T}_{x = j+1,y = 0}^{-x}},\quad \,{{\mbox{and}}}\,$$
(25)
$${P}_{x = 0,y = b}^{* }={P}_{x = 0,y = 0}^{* }\mathop{\prod }\limits_{j=0}^{b-1}\frac{{T}_{x = 0,y = j}^{+y}}{{T}_{x = 0,y = j+1}^{-y}}.$$
(26)

Through Eq. (25) and iterating Eq. (24), we can compute the generic SPD equation:

$${P}_{x = a,y = b}^{* }={P}_{x = a,y = 0}^{* }\mathop{\prod }\limits_{j=0}^{b-1}\frac{{T}_{x = a,y = j}^{+y}}{{T}_{x = a,y = j+1}^{-y}}.$$
(27)

Finally, the value of \({P}_{x = 0,y = 0}^{* }\) is computed through the normalisation condition

$$\mathop{\sum }\limits_{a=0}^{(S-2{S}_{Z})}\mathop{\sum }\limits_{b=0}^{(S-2{S}_{Z}-a)}{P}_{x = a,y = b}^{* }=1,$$

which can be written as

$${P}_{x = 0,y = 0}^{* }+\mathop{\sum }\limits_{b=1}^{S-2{S}_{Z}}{P}_{x = 0,y = b}^{* }+\mathop{\sum }\limits_{a=1}^{S-2{S}_{Z}}\left({P}_{x = a,y = 0}^{* }+\mathop{\sum }\limits_{b=1}^{(S-2{S}_{Z}-a)}{P}_{x = a,y = b}^{* }\right)=1.$$
(28)

Replacing Eqs. (25)–(27) in Eq. (28), we obtain

$${P}_{x = 0,y = 0}^{* }= \Bigg[1 +\mathop{\sum }\limits_{b = 1}^{(S-2{S}_{Z})}\mathop{\prod }\limits_{j = 0}^{b-1}\frac{{T}_{x = 0,y = j}^{+y}}{{T}_{x = 0,y = j+1}^{-y}}\\ + \mathop{\sum }\limits_{a = 1}^{S-2{S}_{Z}}\mathop{\prod }\limits_{j = 0}^{a-1}\frac{{T}_{x = j,y = 0}^{+x}}{{T}_{x = j+1,y = 0}^{-x}}\left(1+\mathop{\sum }\limits_{b = 1}^{(S-2{S}_{Z}-a)}\mathop{\prod }\limits_{k = 0}^{b-1}\frac{{T}_{x = a,y = k}^{+y}}{{T}_{x = a,y = k+1}^{-y}}\right)\Bigg]^{-1}.$$
(29)

The expanded forms of Eq. (27) for both asocial mechanisms (noise and zealotry), with the rates from Eqs. (20) and (21), are presented in Supplementary Note 2.

The SPD for a single state is computed as the sum of the SPD for all values of the other state, that is,

$${P}_{x = a}^{* }=\mathop{\sum }\limits_{b=0}^{S-2{S}_{Z}-a}{P}_{x = a,y = b}^{* },\quad \,{{\mbox{and}}}\,\quad {P}_{y = b}^{* }=\mathop{\sum }\limits_{a=0}^{S-2{S}_{Z}-b}{P}_{x = a,y = b}^{* }.$$
(30)

Instead, when we display the SPD of the difference of the populations committed to the two options, for instance, in Figs. 1f and 4c, d, we compute the following quantity

$$ {P}_{x-y = k}^{* }= \mathop{\sum}\limits_{(a,b)\in {D}_{k}}{P}_{x = a,y = b}^{* },\\ {{\mbox{with}}}\,\; {D}_{k} = \{(a,b)\,| \,a,b\in {{\mathbb{N}}}^{0}\,\& \,a+b\le S-2{S}_{Z}\,\& \,a-b=k\}.$$
(31)
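The chain of Eqs. (25)–(31) can be evaluated term by term. The following Python sketch follows the equations as written: it builds the weights along y = 0 with Eq. (25), extends them along y with Eq. (27), normalises them as in Eq. (29), and then computes the marginal of Eq. (30) and the difference distribution of Eq. (31). Function names and parameter values are illustrative assumptions.

```python
def cross_inhibition_spd(S, qx, qy, sigma=0.0, SZ=0):
    """P*_{x=a,y=b} from Eqs. (25)-(29), with rates from Eqs. (20)-(21).

    Requires sigma > 0 (noise type 1, SZ = 0) or SZ > 0 (zealots, sigma = 0).
    """
    n_max = S - 2 * SZ  # number of flexible individuals

    def pull_x(a):  # per-capita rate of influence towards X, plus noise
        return qx * (a + SZ) / (S - 1) + sigma

    def pull_y(b):  # per-capita rate of influence towards Y, plus noise
        return qy * (b + SZ) / (S - 1) + sigma

    weights = {(0, 0): 1.0}
    for a in range(n_max + 1):
        if a > 0:  # Eq. (25): T^{+x}_{a-1,0} / T^{-x}_{a,0}
            t_plus_x = (n_max - (a - 1)) * pull_x(a - 1)
            t_minus_x = a * pull_y(0)
            weights[(a, 0)] = weights[(a - 1, 0)] * t_plus_x / t_minus_x
        for b in range(1, n_max - a + 1):  # Eq. (27): T^{+y}_{a,b-1} / T^{-y}_{a,b}
            t_plus_y = (n_max - a - (b - 1)) * pull_y(b - 1)
            t_minus_y = b * pull_x(a)
            weights[(a, b)] = weights[(a, b - 1)] * t_plus_y / t_minus_y
    total = sum(weights.values())  # equals 1 / P*_{x=0,y=0}, see Eq. (29)
    return {state: w / total for state, w in weights.items()}


def marginal_x(spd):
    """Eq. (30): P*_{x=a} as a sum over all admissible b."""
    out = {}
    for (a, _b), p in spd.items():
        out[a] = out.get(a, 0.0) + p
    return out


def difference_spd(spd):
    """Eq. (31): P*_{x-y=k} as a sum over the set D_k."""
    out = {}
    for (a, b), p in spd.items():
        out[a - b] = out.get(a - b, 0.0) + p
    return out


# Hypothetical example: S = 20 individuals, of which 2 zealots per option.
spd = cross_inhibition_spd(S=20, qx=1.0, qy=1.0, SZ=2)
print(len(spd), sum(spd.values()), sorted(difference_spd(spd))[:3])
```

Storing the distribution in a dictionary keyed by (a, b) keeps the triangular state space a + b ≤ S − 2SZ explicit.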

Stochastic simulation algorithm

We complemented the analytical form of the stationary probability distribution with its numerical approximation, which we computed through Gillespie’s stochastic simulation algorithm52. Supplementary Figure S2 shows that the analytical and numerical solutions match precisely for both the voter and the cross-inhibition models with both asocial mechanisms (noise and zealotry). As indicated earlier, we could not find the analytical solution of the SPD for the cross-inhibition model with noise type 2 because there are transitions between all three states, and employing the detailed balance principle becomes impractical. Therefore, the results of Fig. 2g, h, which show the effect of noise type 2, are computed through the Gillespie algorithm (\({10}^{4}\) runs per condition). For every run, we initialise the system with a random initial condition and store the amount of time spent in each state throughout long runs (\({10}^{5}\) time-units, with transition rates as defined in Eqs. (1) and (2)). The time in each state is normalised by the total simulation length so that it approximates the stationary probability distribution52.
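The procedure described above can be sketched as follows for the cross-inhibition model with noise type 2. This is a minimal Python illustration of Gillespie's direct method and not the code used for the reported results: the propensities are read from Eq. (2) and scaled by 1/(S − 1) in analogy with Eqs. (20) and (21), which is an assumption, and the function name, seed, and parameter values are hypothetical.

```python
import random


def gillespie_occupancy(S, qx, qy, sigma, t_max, seed=0):
    """Fraction of time spent in each state (S_X, S_Y) during one run."""
    rng = random.Random(seed)
    a = rng.randint(0, S)      # random initial condition for S_X
    b = rng.randint(0, S - a)  # and for S_Y; the rest is undecided
    t = 0.0
    time_in_state = {}
    while t < t_max:
        u = S - a - b
        # (propensity, change in S_X, change in S_Y)
        reactions = [
            (u * qx * a / (S - 1), +1, 0),   # U + X -> X + X (recruitment)
            (u * qy * b / (S - 1), 0, +1),   # U + Y -> Y + Y (recruitment)
            (a * qy * b / (S - 1), -1, 0),   # X inhibited by Y becomes U
            (b * qx * a / (S - 1), 0, -1),   # Y inhibited by X becomes U
            (a * sigma, -1, +1),             # X -> Y (noise type 2)
            (b * sigma, +1, -1),             # Y -> X (noise type 2)
        ]
        active = [r for r in reactions if r[0] > 0.0]
        total = sum(r[0] for r in active)
        # Waiting time to the next event, truncated at the end of the run.
        dwell = t_max - t if not active else min(rng.expovariate(total), t_max - t)
        time_in_state[(a, b)] = time_in_state.get((a, b), 0.0) + dwell
        t += dwell
        if not active or t >= t_max:
            break
        _, da, db = rng.choices(active, weights=[r[0] for r in active])[0]
        a, b = a + da, b + db
    return {state: tau / t_max for state, tau in time_in_state.items()}


occupancy = gillespie_occupancy(S=20, qx=1.5, qy=1.0, sigma=0.05, t_max=50.0, seed=1)
print(len(occupancy), sum(occupancy.values()))
```

Averaging the returned occupancies over many independent runs, each much longer than in this example, approximates the stationary probability distribution.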

Mean switching time

We compute the mean switching time (MST) as the expected time to move from a state of full consensus in favour of Y (the option with lower quality) to a state of stable consensus for X, the option with superior quality. The MST illustrated in Fig. 5a is the average of 500 runs of Gillespie’s stochastic simulation algorithm, each running for one million time-units. A switch towards consensus for the superior option X happens when the subpopulation in favour of X has size \({S}_{X}\ge {\hat{S}}_{X}\) for more than 10 time-units. We do not include these 10 time-units in the MST. The term \({\hat{S}}_{X}\) is the minimum size of the subpopulation committed to X for the system to be considered in a state of consensus for X. The size threshold \({\hat{S}}_{X}\) is calculated numerically from a very long run of the SSA (\({10}^{6}\) time-units) initialised at SX = S − 2SZ and SY = 0. While the population remains in the state of consensus for X, the actual size of the subpopulation SX fluctuates over time. Assuming a normal distribution of the fluctuations, we compute \({\hat{S}}_{X}\) as the average of SX when SX > (S − 2SZ)/2, minus three times the standard deviation of the considered data points.
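The threshold and the switching criterion described above can be sketched in a few lines of Python. The code below is an illustration under stated assumptions: the samples of SX are treated as equally weighted, the population standard deviation is used, the trajectory is given as a list of event times with the value of SX holding from each time onwards, and all function names and numbers are hypothetical.

```python
import statistics


def consensus_threshold(sx_samples, S, SZ=0, n_std=3.0):
    """Mean minus n_std standard deviations of the samples with S_X > (S - 2 SZ) / 2."""
    upper = [s for s in sx_samples if s > (S - 2 * SZ) / 2]
    return statistics.mean(upper) - n_std * statistics.pstdev(upper)


def first_switch_time(times, sx, threshold, t_end, hold=10.0):
    """Start time of the first period with S_X >= threshold lasting more than `hold`.

    times[i] is the time at which S_X takes the value sx[i]; the last value
    holds until t_end. Returns None if no such period occurs.
    """
    start = None
    for t, s in zip(times, sx):
        if s >= threshold:
            if start is None:
                start = t  # S_X has just crossed the threshold
        else:
            if start is not None and t - start > hold:
                return start  # the hold time itself is not counted
            start = None
    if start is not None and t_end - start > hold:
        return start
    return None


# Hypothetical data, for illustration only.
threshold = consensus_threshold([90, 92, 94, 10, 12], S=100)
switch = first_switch_time([0, 5, 8, 20, 40], [10, 95, 50, 96, 97], 90, t_end=60.0)
print(threshold, switch)
```

Returning the start of the qualifying period, rather than the time at which the 10 time-units have elapsed, reflects the choice of excluding the hold time from the MST.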

Swarm robotics experiments

We ran four experiments with a swarm of 100 Kilobots53, which are low-cost, small-sized robots that can move on a flat surface, display their internal state through a coloured LED light, and broadcast 9-byte infra-red messages to neighbours within a range of 10 cm (see Fig. 1d). The robots’ movements are limited to (approximately) straight motion at a speed of about 1 cm/s and rotation in place at about 40°/s. In our experiments, by alternating straight motion and rotations, the robots diffused throughout the 1 × 1 m² environment, using the same mobility model as in ref. 35, and hence changed their neighbourhood over time. Every 30 seconds, each robot reads the last message received from its neighbours and uses it to update its opinion through one of the two state machines of Fig. 1. As shown in previous research34,54, using a relatively low frequency of opinion updates (every 30 s) compared with the random diffusion speed (1 cm/s) yields a good qualitative agreement between the macroscopic models, which assume a well-mixed interaction topology, and the robotic implementation, which relies on local interactions within a range of 10 cm. The robots showed their opinion via the coloured LED (option X as red, option Y as blue, and no opinion as green). The experiments were recorded using an overhead camera (the videos are available as Supplementary Movies 1–5), and the robots’ position and state (that is, their light’s colour) were tracked through the ARK system55. In every experiment, we included a number of zealot robots, which ran the same algorithm as the others except that they did not update their opinion. We began each experiment with the swarm opinion equally split, 50-50, between the two options. Each experiment lasted 60 minutes, and the algorithms run by the robots are open-source and available in the Supplementary Software (see Code Availability Statement).