1 Introduction

Building structures are among the most remarkable production processes we humans undertake. However, it is both costly and time-consuming. Consequently, integrating robots into construction processes can be of great benefit. Social insects provide important examples of collective behaviors such as creating large complex structures. These stem from simple behaviors of agents without centralized control or pre-planning. One prominent example is termites: millions of small insects successfully self-organize to build massive, complex mounds that sometimes exceed 12 m in height.

Inspired by natural swarms, researchers have started looking into swarm robotics systems that are able to construct increasingly complex structures  [1, 2, 38]. Similar to social insects, construction by a robot swarm involves agents arranging building materials in an environment to form structures. To do so, usually robots coordinate through stigmergy. In contrast to direct communication, stigmergy enables the coordination of the agents’ activities through changing their shared environment  [5, 8, 32]. Upon sensing such environmental changes, the agents conclude their next activity. For example, the availability of building material provides a cue for robots, triggering their decision-making process. Nevertheless, in some situations, stigmergy may not be a sufficient or appropriate means of communication; in these cases, direct communication becomes necessary to exchange particular pieces of information for successful task completion.

We base our study on the Swarm Robotics Construction System (SRoCS)  [3], a simulation of which is shown in Fig. 1(a). In this system, robots use computer vision to monitor other robots’ actions, and based on these observations, they perform predefined construction actions that advance to complete a partially built structure. In this paper, we consider a scenario where a swarm is divided into two groups of robots. The first group is responsible for exploring the environment, finding building material in form of building blocks, and transporting it to the construction site. This behavior is referred to as the foraging task. The second group is responsible for assembling the foraged blocks into a structure. This behavior is referred to as the construction task. We consider two groups of robots since some robots must remain in the cache area over extended periods of time to estimate the tile density. This estimation is paramount to effectively allocating the robots between the foraging and construction tasks.

Fig. 1.
figure 1

A structure being built using the swarm robotics construction system: (a) in the ARGoS simulator, and (b) an abstraction of the environment with the cache area (pink), and building blocks (black). (Color figure online)

In this study, we focus on the region surrounding the construction site where blocks can be temporarily placed for construction. We call this the cache area (pink in Fig. 1). In order for construction robots to perform their task efficiently, there must be enough blocks in the cache area. Such blocks are discovered and transported to the cache by the foraging robots. The goal in this study is to maintain an amount of blocks that maximizes the rate of construction. To achieve this, the swarm must find a suitable allocation between the robots adding blocks to the cache area (i.e., robots in the foraging task) and the robots removing blocks from the cache area (i.e., robots in the construction task) such that a certain amount of blocks is always preserved in the cache. To achieve this proper allocation, the construction robots need to collectively perceive the number of blocks in the cache area. This estimate helps the individual robots switch to foraging when the estimate drops below a given threshold, or to continue estimating otherwise. This collective perception process is highly challenging as it is performed in a dynamic environment, where the number of blocks in the cache area changes continuously. To address this challenge, we consider an abstraction of the construction scenario, where we can focus on the collective perception and decision-making dynamics involved in finding a suitable robots allocation between the two tasks. We use a homogeneous swarm, in which all robots are capable of performing either the construction task or the foraging task. The robots performing the construction task can communicate with each other if they are within range. Differently, robots performing the foraging task are unable to interact with the robots in the cache and vice versa. Once a robot switches to the foraging task, it always returns to the construction task with a new block after a period of time spent on foraging. Finally, the building material is assumed to be homogeneous and the maximum rate at which the building blocks can be attached to a structure is constant throughout the experiment.

The results of our collective perception process show that a swarm can collectively estimate and track the state of a dynamic environment, and use this information to find a suitable allocation between tasks. Furthermore, we show how parameters of the task and of the environment influence the accuracy of the collective perception process. Consequently, we discuss how our controller could be modified so that it is more resilient to these parameters. The remainder of this paper is organized as follows. In the next section, we provide a brief overview of the literature on collective perception and decision-making. In Sect. 3, we detail the robots’ behavior and the abstract environment in which we evaluate it. In Sect. 4, we summarize and discuss the key results. We conclude this paper in Sect. 5. The data generated by this research is available online as a project on the Open Science Framework.Footnote 1 This project includes all software components and documentation required to reproduce and to extend this research  [15].

2 Related Work

In robot swarms, collective decision-making has been extensively researched [10, 12, 21, 22, 25, 27,28,29, 33, 34, 37], due to its essential role in a wide range of tasks including foraging, flocking, and construction. In general, two types of system outputs are associated with a collective decision-making process: (i) a consensus in which the swarm agrees on a single option [35], and (ii) a division of the swarm among different tasks—i.e., task allocation— [6, 16, 19, 24]. The goal of the swarm in the consensus-achievement is to converge as quickly as possible on one option, referred to as a symmetry-breaking decision [11, 18], or to converge on the best option as quickly as possible, referred to as a value-sensitive decision [9, 23]. In the case of task allocation, the goal of the swarm is to maximize the system performance by minimizing the idle time of individual robots.

In this study, we tackle a specific process of collective decision-making that is referred to as collective perception. Collective perception is a process, in which the robots are engaged in perceiving an environmental stimulus collaboratively [14, 30, 31, 36]. In general, robots need to perceive particular signals in their environments to make decisions based on the perceived values. We assume that the environments where robot swarms are deployed are large so that the perception of a single robot is far from sufficient to sense a system-wide stimulus—i.e., a stimulus that spreads across a large space of the environment. This motivates the need for collective perception, where robots combine their perceptions in a distributed manner, and self-organize to act as a single unit.

Collective perception is observed in social insects such as honeybees, where individuals tend to evaluate, for instance, queuing delays to optimize their task allocation  [26]. Also bee foragers transfer their nectar load to multiple receivers suggesting the use of this behavior to estimate the environmental nectar flow  [13]. These studies inspired collective perception in artificial systems such as robot swarms. For example, authors in  [30] developed a bio-inspired algorithm enabling a robot swarm to aggregate at two locations, where the size of each group corresponds to the size of the selected location. In [36], the authors use a robot swarm to collectively decide which of two colors is the most represented in a pattern drawn on arena ground. The algorithm developed in  [36] was tested for benchmarking and generalization in  [4] across a larger number of patterns (nine). Contrary to  [36], the authors in  [4] find that the difficulty of the collective perception process doesn’t depend mainly on the ratio of one color to the other, but on the distribution of each color in the environment. The authors in  [7] proposed a distributed Bayesian algorithm to solve the collective perception task of a similar two-color environment. They define the speed vs. accuracy trade-off of the collective perception as a multi-objective optimization problem. Additionally, the authors have shown that it is possible to guarantee the accuracy of the collective perception, at the cost of decision time.

None of the aforementioned algorithms, however, support collective perception in a dynamic environment, where the perceived features change over time. Dynamic environments impose a serious challenge to collective perception algorithms, i.e. the rate at which the swarm reaches a consensus vs. the rate at which the environment changes. Also, contrary to other algorithms, our collective perception algorithm attempts to estimate absolute values of the perceived feature (e.g., the percentage of a particular color), instead of merely providing a decision on its relative properties (e.g., color x is represented more than color y).

3 The Model

We approach the collective construction problem using an abstract model, as depicted in Fig. 1(b). In the abstract model, we focus on the cache area, in which building blocks are modeled as 2D tiles. These tiles are moved from the foraging area to the cache area by robots performing the foraging task. Robots that are allocated to the construction task, explore the cache area and pick up tiles (when encountered) and use these to build a structure. We omit the details of the foraging process, and instead we replace it by a stochastic process that characterizes the retrieval of tiles. We define a lower-bound density of tiles in the cache area that we call the target density \(\varGamma \). This density enables us to minimize the idle time of constructing robots—i.e., to maximize the construction rate. We assume the density \(\varGamma \) to be known and provided to the swarm. The goal of the swarm is to allocate the robots to the foraging and construction tasks so that \(\varGamma \) is satisfied and maintained in the cache area.

3.1 The Retrieval Process of Tiles

We model the output of the foraging process—i.e., the retrieval of tiles—using a renewal process. This is a sequence of random variables, which are referred to as the arrival times, at which a repeating event occurs—i.e., retrieving a tile. The inter-arrival time is the period between two consecutive events, these in our study are two consecutive retrieval of tiles. We model these inter-arrival times by sampling from an exponential distribution with the density function:

$$\begin{aligned} f_T=\frac{1}{\lambda _f} e^{-\frac{1}{\lambda _f} T}, \end{aligned}$$
(1)

where the parameter \(\lambda _{f}\) is the average time a robot spent foraging before returning to the cache area with a new tile. Modeling the inter-arrival times to be exponentially distributed results in the renewal process to be a Poisson process, a common way to model arrival events [17, 20, 39]; therefore, the average number of tiles retrieved by foraging robots within the time period \(\delta t\) is given by:

$$\begin{aligned} \langle {M_i}(\delta t)\rangle = \lfloor {\dfrac{\delta t}{\lambda _{f}}} \rfloor N_{f \xrightarrow {}c}(\delta t), \end{aligned}$$
(2)

where \(N_{f \xrightarrow {}c}(\delta t)\) is the number of robots switching from foraging to construction in the time interval \(\delta t\). The value of this variable changes over time as a function of the number of robots in the cache and the estimated and target tile density.

3.2 The Simulated Environment

To evaluate our collective perception process, we use the ARGoS simulator to create a 4 \(\times \) 4 m\(^2\) arena that is divided into 2500 tiles. We run experiments with 80 robots that can drive around the arena, avoid obstacles such as walls and each other, sense whether or not they are driving over a shaded tile, and communicate with each other over wifi. We restrict the communication distance of the robot to be a maximum of one meter to prevent global communication in the swarm.

We simulate a robot switching to the foraging task by removing it from the simulation and setting up a countdown timer that is initialized following Eq. (1). The mean of this distribution \(\lambda _f\) in Eq. (1) is one of the key parameters which we vary in our experiments. Once the timer reaches zero, we simulate the robot switching back to the construction task by adding it back into the simulation. Using this strategy, the foraging robots are unable to communicate with the robots performing the construction task, which is consistent with our scenario.

When a robot returns from the foraging task, a cell in the arena is shaded to represent the tile this robot retrieved. We follow a probabilistic approach to place the retrieved tile in the cache. Our approach has the effect of creating clusters of tiles in the cache area (see Fig. 2). This is achieved by increasing the probability to select a candidate location x by a factor of \(\kappa =5\) for each tile that is adjacent (max. 8 tiles) to the location x. Hence candidate locations with more tiles in the neighborhood have a higher probability to be selected. Performing construction—i.e. removing a block from the cache and attaching it to a hypothetical structure—is simulated by a tile being unshaded. This transition occurs whenever a robot moves off a tile and onto another tile. The maximum number of tiles that can be unshaded per second is the construction limit \(\xi _c\), which is another key parameter that we vary in our experiments. A full simulation run with default parameters is hosted on Open Science Framework.Footnote 2

Fig. 2.
figure 2

Screenshot of a simulation in the cache that shows the tiles clustering effect. The magenta lines represent the communication links between the robots.

3.3 The Robot Behavior

A robot moves through the cache in a straight line unless it encounters an obstacle. In case of an obstacle, the robot turns on the spot until its heading is clear. When the robot is not avoiding obstacles, it samples the ground beneath it to determine whether it is on top of a tile. The robot keeps track of the total number of samples it has taken and the number of times a sample was taken on a tile. In addition to these counts, the robot’s memory also contains a table that records these counts received from the robot’s neighbors. An entry in this table contains three fields: (i) the neighbor’s total number of samples, (ii) the neighbor’s number of samples taken over a tile, and (iii) a time-to-live value that is used to drop the neighbor’s entry from the table when its value reaches zero.

At each time step, a robot adjusts the table in its memory by decrementing the time-to-live field for each entry. It then sends this table with an additional entry, representing its sample counts, to all of the robot’s direct neighbors. The time-to-live value in this additional entry is initialized to its maximum value. When a direct neighbor receives this information, it updates its table by replacing the contents of each neighbor’s entry with the one with the highest time-to-live value found among the existing entries in its table and the entries from the received messages. In this way, each robot always has the most up-to-date entry for each robot that it has recently communicated with. This time-to-live value is also used to avoid loops and to prevent duplicate entries in a robot’s table.

After a robot has updated its table, it uses both its local sample counts and the sample counts from its neighbors to estimate the tile density. To compute this estimate \(\langle \gamma _i(t)\rangle \) the robot i constructs a weighted average, in which the contribution of each entry (neighbor j’s information) is weighted by (i) the number of samples that neighbor has taken and (ii) the hop distance of that neighbor (calculated from the time-to-live field). This average includes i’s own sample counts weighted by the number of samples i took and a hop distance of one.

$$\begin{aligned} \langle \gamma _i(t)\rangle = \frac{\omega _i(t) \gamma _i(t) + \sum _{j\in N_i} \omega _j(t) \gamma _j(t) }{\omega _i(t) + \sum _{j\in N_i} \omega _j(t)}, \end{aligned}$$
(3)

where \(N_i\) is the set of robot i’s direct neighbors, and \(\gamma _i(t)\) is the tile density measured locally by robot i and defined as:

$$\begin{aligned} \gamma _i(t) = \dfrac{c_i(t)}{s_i(t)} \end{aligned}$$
(4)

where \(c_i(t)\) is the number of samples taken by robot i while driving over tiles, and \(s_i(t)\) is the total number of samples taken by robot i. The weight \(\omega _i\) assigned to the locally-measured density by robot i in Eq. (3) is defined as:

$$\begin{aligned} \omega _i(t)=s_i(t)h_{ij}(t) \end{aligned}$$
(5)

where \(h_{ij}(t)\) is the hop distance between robot i and robot j. The weighted average increases the influence of robots that have sampled larger areas of the arena and that are further away. The latter weighting makes the swarm more resilient against over-estimating the tile density which would otherwise occur due to the clustering of tiles.

After estimating the density of tiles, each robot i decides probabilistically to switch from the construction to the foraging task, as long as \(\langle \gamma _i(t)\rangle \) is lower than the target density \(\varGamma \). The switching probability \(Pr_i^{c \xrightarrow {} f}(t)\) of robot i at time step t is proportional to the difference between robot i’s estimate and \(\varGamma \):

$$\begin{aligned} Pr_i^{c \xrightarrow {} f}(t)= {\left\{ \begin{array}{ll} \eta |\varGamma -\langle \gamma _i(t)\rangle | &{} \quad \text {if} \langle \gamma _i(t)\rangle <\varGamma \\ 0 &{}\quad \text {otherwise}, \end{array}\right. } \end{aligned}$$
(6)

where \(\eta \) is a design parameter used to keep \(Pr_i^{c \xrightarrow {} f}(t)\) in the interval [0, 1].

4 Results and Discussion

We have investigated the performance of the collective perception process as well as the robots’ task allocation for different experiment configurations. Specifically, our results were obtained over the following set of parameters: (i) the mean foraging time \(\lambda _f\) (tested values \(\lambda _f \in \{5,10,20\}\)), (ii) the lower-bound of the target density \(\varGamma \) (tested values \(\varGamma \in \{0.1,0.3,0.4\}\)), (iii) the construction limit \(\xi _c\) (tested values \(\xi \in \{5,10,20\}\)), and (iv) the constant \(\eta \) in the switching probability as defined in Eq. (6) (tested values \(\eta \in \{0.1,0.15,0.2\}\)). We have published the data from these experiments online  [15]. The evaluation of our approach spans over four metrics. The first is the time trajectory of the density of tiles \(\rho _{gt}(t)\), which is the ground truth; the second is the time trajectory of the swarm estimate \(\rho _{s}(t)\); the third is the time evolution of the individual deviation from the swarm estimate of the tile density: \(\varDelta _i(t)= |\langle \gamma _i(t)\rangle - \rho _{s}(t)|\); and the fourth illustrates how the robots allocation to construction and foraging evolves over time. We run all experiments for 2 500 s (12 500 time steps) with a swarm size of 80 robots and average the results of each experiment across 30 runs. In the following we discuss our findings over a subset of the parameters’ tested values.

Let us start with the first metric, tile density. Figure 3 shows that the swarm was able to increase the tile density in the cache area and keep it above the target density \(\varGamma \) for \(\varGamma \in \{0.3,0.4\}\). The swarm estimate is initially in full agreement with the ground truth as both start at 0 tiles. Over time, the swarm estimate \(\rho _{s}(t)\) stabilizes around the target density \(\varGamma \), with a minority of robots (see the standard deviation) estimating the tile density to be higher than \(\varGamma \). This minority acts to reduce the number of robots sent to retrieve tiles, while the majority acts to increase this number, leading the ground truth to a value higher than the target density. The interplay of these two groups in the swarm causes both the swarm estimate and ground truth to stabilize with the ground truth higher than the swarm estimate. We also notice that increasing the mean foraging time (\(\lambda _f \in \{10,20\}\)) leads to a slower increase in \(\rho _{gt}(t)\). This is because higher values of \(\lambda _f\) imply longer periods between the retrieval of tiles, on average. Furthermore, we see that for a specific target density \(\varGamma \), \(\rho _{s}(t)\) seems to be maintained across different values of \(\lambda _f\) and \(\xi _c\). This suggests that the collective perception process is relatively robust to both parameters. This robustness implies that our algorithm is suitable for real-world construction tasks, where complicated foraging that takes longer to find building materials, or prolonged assembly of building material does not affect the collective perception performance.

Fig. 3.
figure 3

Tile density for \(\eta =0.2\) and target density (a) \(\varGamma = 0.3\) and (b) \(\varGamma = 0.4\)  swarm estimate \(\rho _{s}(t)\),  ground truth \(\rho _{gt}(t)\) (averaged across 30 runs).

Figure 4 illustrates the mean individual deviation \(\langle \varDelta _i(t) \rangle \) from the swarm estimate for different target densities \(\varGamma \in \{0.3,0.4\}\), mean foraging times \(\lambda _f \in \{10,20\}\), and construction limits \(\xi _c \in \{10,20\}\). Our results show robustness of the individual deviation with regard to changes in these parameters, with a maximum average deviation of 0.15 (\(\langle \varDelta _i(t) \rangle \le 0.15\)). Such a small variance \(\langle \varDelta _i(t)\rangle \) indicates a strong agreement between the individual and the group estimate.

Fig. 4.
figure 4

Average individual deviation \(\langle \varDelta _i(t) \rangle \) from swarm estimate \(\rho _{s}(t)\) for \(\eta =0.2\) and target density (a) \(\varGamma =0.3\) and (b) \(\varGamma =0.4\) (averaged across 30 runs).

Fig. 5.
figure 5

Fraction of foraging robots for \(\eta =0.2\) and target density (a) \(\varGamma = 0.3\) and (b) \(\varGamma = 0.4\) (averaged across 30 runs).

Finally, Fig. 5 shows the fraction of robots allocated to foraging over time. At the beginning, all 80 robots are in the cache area for all experiment configurations. Thus, initially, there is a jump in the number of robots leaving from the construction task to the foraging task. This jump is due to the low swarm estimate \(\rho _s(t)\) of the tile density during the first 100 s of the experiments (see Fig. 3). During this initial period, the estimate \(\rho _s(t)\) of the swarm remains consistently below the target density. Hence the condition \(\langle \gamma _i(t)\rangle <\varGamma \) is true for a large majority of the robots, that then switch to the foraging task following Eq. (6) with a relatively high probability. The magnitude of the spike increases with the mean foraging time \(\lambda _f\). This is due to the longer time it takes for the robots to arrive back from the foraging area, and thus for the tile density and the swarm estimate to rise. Nevertheless, as soon as the swarm estimate stabilizes and the ground truth of tiles in the cache reaches or exceeds its target \(\rho _{gt}(t) \ge \varGamma \), the fraction of the foraging robots starts to drop until it stabilizes, leading the system into an equilibrium state with respect to the robots allocation. The drop takes longer time for larger \(\lambda _f\). Furthermore, the fraction of robots that continue foraging is higher for higher \(\varGamma \) values. This is because tiles need to be retrieved at a faster rate. Additionally, the fraction of foraging robots is higher for larger \(\lambda _f\) given the same \(\varGamma \). This is due to the slower rate of tile retrieval when \(\lambda _f\) is larger. Thus, this slower rate pushes more robots to foraging while the robots that remain in the cache are performing construction and estimating the tile density \(\langle \gamma _i(t)\rangle \) for a longer time.

5 Conclusions

We studied an abstract scenario of a collective construction process by a robot swarm in a dynamic environment. In this scenario, we focused on the cache area—i.e., an area that surrounds the construction area—, where blocks are modeled as tiles that appear through foraging and disappear through construction. Robots can switch between two tasks: foraging and construction. Foraging robots explore the foraging area and retrieve the building material, while construction robots conduct a collective perception process to maintain a minimum tile density in the cache. The details of the foraging task are abstracted away and it is modeled using a Poisson process which delivers tiles with a specific rate (\(1/\lambda _f\)) to the cache. This enabled us to focus on research questions concerning the design of a task allocation mechanism that exploits collective perception in a dynamic environment. In future work, we plan to simulate a detailed foraging process. The collective perception process aims to assign robots to the foraging task to increase the tiles retrieval rate whenever the density in the cache drops below the target density \(\varGamma \)—i.e., the required lower-bound on the tile density. Robots in the cache rely on both their samples and the samples of their neighbors to compute an estimate of the tile density. This estimate is computed as a weighted average that assigns higher importance to (i) robots that are further away, making the swarm more resilient to over-estimation or under-estimation due to the clustering of tiles; (ii) robots with larger samples, as these contributions are more representative. Robots use their estimate from Eq. (3) to probabilistically decide whether to switch to the foraging task or to continue estimating/performing the construction task.

Our results show that the proposed collective perception process leads to a proper robots allocation, which in turn guarantees a minimum tile density in the cache. This allocation changes as a function of the average time it takes a robot to find and retrieve a tile \(\lambda _f\), and the target density \(\varGamma \). Furthermore, our results show a strong agreement between the individual estimate \(\langle \gamma _i(t)\rangle \) and the swarm estimate \(\rho _s(t)\), with a maximum variance of 0.15. This study is a first step towards designing collective perception processes in dynamic environments, in which the perceived feature (e.g., tile density) changes over time and the goal is to estimate its absolute value. In future work, we plan to study the influence of the size and update rate of the robots’ memorized samples on the performance of the perception process. We also intend to extend the proposed algorithm to enable the swarm to maintain a tile density close to its target.