1 Introduction

The task of collecting objects clustered at random locations and transporting them to a central depot can benefit from a decentralised solution. In contrast to a single large vehicle dedicated to load/unload all the objects, an interesting solution consists in having a large number of simple autonomous vehicles, or robots, each carrying a single object and coordinating with each other. Advantages of this solution are parallel exploration of the environment and possibility to distribute the resources among various source locations. Controlling the robot behaviour via a decentralised algorithm adds the advantage of scalability, by which the system throughput can be calibrated with increase/removal of robots without need for a system redesign. We study a decentralised solution that employs a swarm of simple robots that coordinate via stigmergic communication to collect items clustered in the environment in various source areas and to transport them to a central depot. Exploiting nearer sources would increase the system throughput, however objects may also have different qualities. We study how our system can balance the trade-off between quality and distance.

The task of finding and collecting objects is known in the multi-robot literature as foraging due to the resemblance to the activity that some animals perform when hunting food. In foraging terms, the source areas are food sources and the depot area is the animal’s nest. We exploit this analogy with biology to also design our solution. Inspired by foraging behaviour of ant colonies, we design a robot swarm that relies on a stigmergic communication medium similar to the pheromone trails used by ants [12, 26, 56, 60]. Initially, scout ants randomly search the environment, when one finds food she returns to the nest carrying a food item and leaving a pheromone trail on her path. Through this process, ants create pheromone trails between their nest and food sources located by scout ants. The pheromone trails are used by other members of the colony to avoid further random exploration and to exploit the sources that have been already found. Similarly, we design a solution where robots start to randomly search the environment and, later, they converge to exploiting the object sources by relying on stigmergic communication. It has been observed that ants modulate pheromone deposition as a function of the food source quality [22, 26, 60] (or of the nest-site quality during nest hunting [28]). In a similar fashion, our robots deposit pheromone proportionally to the estimated objects’ quality. In this study, we focus on the strategies to coordinate the robot motion leaving in abstract terms the object load/unload issues.

Previous work investigated the use of pheromone as a form of indirect communication between robots (see a review in Sect. 2). Our study includes analysis of the quality-distance trade-off which is an aspect that has not been previously explored in multi-robot foraging studies (see the problem description in Sect. 3). We implement a foraging swarm composed of simple robots (Sect. 3.1), the Kilobots [50], that operate in a virtual environment where they can deposit/sense virtual pheromone (Sects. 3.2 and 3.3). In Sect. 4, we evaluate the system performance through simulations and we employ the Augmented Reality for Kilobots (ARK) [47] system to showcase the functioning with a set of demos with swarms up to 100 Kilobots. We finally discuss the relevance of the work for engineering and biology in Sect. 5.

2 Related Work

Several studies employed a form of stigmergy similar to the ants’ pheromone trails to coordinate the robots’ movement. A pivotal point of these studies concerns the way in which pheromone is implemented, that is how the environment stores/updates the pheromone and how the robots deposit/sense pheromone in the environment. We identify and discuss three main categories which we name: beacon robots, smart-environment based, and on-board pheromone.

The first robotic systems that used pheromone communication to coordinate the group motion allocated a set of robots as static beacon robots [5, 10, 19, 25, 36, 39, 59]. The role of the beacons was to store and communicate pheromone levels to robots that moved in their surroundings. The advantage of this solution is that it can be implemented by simple robots in unknown unstructured environments. The drawback is that part of the robots are not actively contributing to the main task (e.g. foraging and collecting items) but need to stop and act as beacons. This strategy may limit the functioning in vast environments. Mobile robot beacons overcome the sacrifice of robots and allow robots to be concurrently beacons and active foragers [11, 53]. However, the correct functioning relies on the tuning of the swarm size and communication range as a function of the environment size.

Several studies, similarly to ours, implemented pheromone communication through a smart environment which was capable to store virtual pheromone information and to provide this information in real-time to the robots [1, 17, 18, 21, 54, 58]. Within this category, several studies implemented virtual pheromone through the use of RFID tags which were deployed in the environment and stored pheromone information [3, 23, 24, 29, 32, 33]. Our study relies on a different form of smart environment: Kilobots perceive and deposit virtual pheromone via ARK which has similarities with implementations of [1, 17, 18, 54]. Similarly to ARK, robots were real-time tracked with an overhead camera, although, differently from ARK, their robots used the light sensors to read the virtual pheromone that was projected as light on the floor. In [58], pheromone foraging was implemented on a Kilobot swarm using a different augmented reality system, the Kilogrid.

Researchers designed various solutions to equip robots with on-board sensors and actuators customised to mark the environment and thus shifted the pheromone mechanism from the smart-environment to the robots. In an early work [55], the robot used a marker pen to draw lines on the floor to improve its performance in the area coverage task. This technology had the drawback of not allowing evaporation or diffusion of pheromone. Differently, in [45], the robots could emit and read gas which was used to guide other robots towards a source area. A limitation of this work was the high volatility of the gas. In [34], the E-Puck robots were equipped with phosphorescent glowing paint to temporarily mark the environment. Robots had to operate in a dark environment and follow light to move between two areas. Finally, in [15, 16], robots used alcohol to mark the environment and improve the collective performance in the foraging task.

Most work discussed in this Section, as ours does, aims to implement a robotic system for the foraging task where robots are asked to move between two (or more) locations (mimicking the activity of objects collection). In our study, we include the aspect of objects’ quality that has not been taken in consideration earlier and we analyse how the system can balance the quality-distance trade-off. Previous work included robot swarms up to a maximum of 50 robots [58], in this study we scale to 100 robots.

3 Problem Description

A robot swarm is asked to collect objects from n source areas deployed in a 2D environment and transport them to a central depot area. In this study, we ignore the details relative to object picking, deposition, and storage; instead, we focus on the coordinated activity of the robots to move between sources and depot areas. Each source area \(A_i\) (\(i \in \{1,2,\dots ,n\}\)) is an infinite source of one type of object characterised by a quality \(v_i \in [0, 10]\). The objective is to maximise the throughput of objects weighted by their quality.

3.1 Robots

The swarm is completely decentralised and composed of S simple autonomous robots that have minimal knowledge about the environment and limited sensory and memory capabilities. We assume that the robots do not know and cannot keep memory of the number, location, and quality of the source areas. The robots do not communicate with each other or cannot perceive other robots and obstacles in the environment. The robots coordinate and collaborate with each other only through stigmergic communication, i.e. by leaving temporary traces in the environment that can be read by other robots. The robots are equipped with the following sensors and actuators: (i) differential drive motors to move in the 2D environment, (ii) area sensor to detect source and depot areas when the robot is within the area, (iii) object quality sensor to estimate the quality of the collected object, (iv) depot direction sensor to know the relative orientation towards the depot area, (v) pheromone gland to leave in the environment temporary traces (i.e. pheromone), and (vi) pheromone antennae to perceive the presence of pheromone in the robot’s immediate surroundings.

3.2 The Kilobots and ARK

We implemented the robot swarm using Kilobots [50] which are inexpensive simple robots designed to perform large-scale swarm robotic studies. The Kilobot modulates the frequency of its two vibration motors to move on a flat surface. The motors have been automatically calibrated via ARK [47] to move at an average speed of and rotate in place at . The robots have a limited set of sensors and actuators therefore we relied on the ARK system to enhance the Kilobot’s capabilities. The ARK system allows the user to equip the Kilobot with a customised set of virtual sensors and virtual actuators to sense and modify simulated virtual environments shared by all robots in realtime. While the ARK system has global information on the environment and the process, its function is limited to enhance the Kilobots’ abilities, and enrich the experiments for they can be used within; ARK lets the Kilobots operate autonomously in a decentralised fashion without any central control.

Via ARK, we equipped the Kilobots with the required virtual sensors and actuators. The Kilobot periodically receives a message with the relative direction to the depot (coded in 4 bits). When the Kilobot is within an area (either source or depot), ARK informs the robot of the type of area (2 bits) and, if within source, of the object’s quality (4 bits). The Kilobots, through their LED, signal to ARK when they want to deposit a drop of pheromone in their current position. Finally, ARK signals the Kilobot if it has pheromone within detection range (4 bits). The detection range of the virtual pheromone antennae is depicted in Fig. 1(a); the robot can perceive a binary value (presence/absence of pheromone) in four areas 45\({^\circ }\) wide in front of itself at a maximum distance of \({\sim }3.5\) cm.

Fig. 1.
figure 1

(a) The robot can perceive the presence of pheromone in its immediate surrounding through virtual pheromone antennae implemented via ARK [47]. Kilobots sense a binary value (presence/absence of pheromone) in each area. In the depicted example, the cyan shapes show traces of virtual pheromone and the robot’s reading is [1, 0, 1, 0]. (b)–(c) Example of pheromone dynamics as from Eq.(1); at time t, the cell c(ij) has a pheromone value of 100, and at time \(t+dt\) (with \(dt=0.5\) s) the pheromone evaporated at rate \(\epsilon =0.08\) and diffused at rate \(\gamma =0.01\) to the four neighbouring cells.

The virtual environment is updated in realtime by the ARK system that increases pheromone level when a robot deposits a pheromone drop \(\phi =100\) and computes evaporation and diffusion of pheromone over time. The pheromone is stored in a matrix that discretises the 2D environment in 6.7 mm cells (i.e. 150 cells per metre). At each time-step (of length \(dt=0.5\) s), ARK updates each matrix cell c(ij) (with generic indices (ij)) as follows:

$$\begin{aligned} c(i,j) = c(i,j)[1 - (\epsilon + 4\gamma )dt] + \gamma [c(i, j \pm 1)+c(i \pm 1, j)]dt, \end{aligned}$$
(1)

where parameter \(\epsilon =0.08\) is the evaporation rate and \(\gamma =0.01\) the diffusion rate, and \(c(i,j) \ge 0\). Figure 1(b)–(c) show an example of the pheromone dynamics where at time t a drop \(\phi =100\) is deposited at cell c(ij). Equation (1) is a simplification of the exponential decay observed in ant’s pheromone [8, 17].

3.3 Robot Behaviour

The proposed solution has been designed taking inspiration from the foraging behaviour of ants that use pheromone trails to mark the environment. This form of stigmergic communication allows the colony to limit unnecessary independent exploration and to coordinate among peers to collectively exploit the found food resources. The individual behaviour of the Kilobot is implemented as the finite state machine (FSM) of Fig. 2.

Fig. 2.
figure 2

FSM of the individual Kilobot behaviour. The arrows represent transitions between states which are represented as circles.

At the beginning, the robots do not have information about the source location(s) therefore they start searching the environment. Given the Kilobot’s limited capabilities, an easy and efficient method to search an unknown environment is through an isotropic random walk [9] which we implemented with alternate straight motion for 7.5 s and uniformly random rotation in [\(-{\pi }\), \(\pi \)]. Once a source area \(A_i\) has been found, the robot (virtually) collects one object and carries it towards the depot area. On its way towards the depot, the robot deposits drops of pheromone with probability \(P_i\) proportional to the object quality \(v_i\), i.e. \(P_i=v_i/v_{max}\). The Kilobot updates its decision to deposit pheromone every \(\sim \)2 s (which it signals to ARK via its LED), therefore a medium-quality object will lead the robot to lay down intermittent pheromone trails. Once the Kilobot returned to the depot, it unloads the object, turns 180\(^\circ \), and resumes exploration because it cannot store in its memory the source area location. However, through pheromone trails the Kilobot exploits a form of collective memory which is stored in the environment in the form of temporary stigmergic information. In fact, once a Kilobot perceives pheromone in any of the four antennae areas (of Fig. 1(a)), it follows the trail by moving in the direction of the triggered antennae area. If more than one antennae area detect pheromone (as in the example depicted in Fig. 1(a)), the robot selects the area in the most opposite direction from the depot. This selection relies on the assumption that robots only deposit pheromone in their straight path from a source area to the depot and that they have access to the depot vector.

As in every study, we make the experiment code available online; download it at https://github.com/DiODeProject/PheromoneKilobot.

4 Experiments and Results

We measured the system performance through accurate physics-based simulation of the Kilobot swarm. We ran our simulations via ARGoS [41] which is a simulator tailored to swarm robotics needs that allows high speed and accurate simulation of the physics dynamics. ARGoS allows the simulation of the Kilobot robots and the ARK system through its dedicated plugin [40]. Using ARGoS is particularly advantageous because it allows the experimenter to use the same identical code in simulation and on the robots.

4.1 Simulation Scenarios

We investigated the performance of a Kilobot swarm of size \(S \in \{50, 100, 200\}\) in a 2.5 m \(\times \) 2.5 m environment with a central circular depot area and \(n \in \{1,2,4\}\) circular source areas with a radius of 10 cm. The source areas’ positions and object’s qualities were varied to study the quality-distance trade-off. We varied the distance from the depot of the source areas \(d_i \in [0.5,1.5]\) m and object’s qualities \(v_i \in [0,10]\), with \(i\in n\). The robots were initially deployed with (uniformly) random position and orientation within a square 70 cm \(\times \) 70 cm region centred on the depot area. The experiments length was 20 simulated minutes. We report the mean number of objects retrieved from each source and the mean number of robots on each path (computed as the number of robots at a maximum distance of 20 cm from the straight line between depot and source).

4.2 Results for Varying Distance and Quality

We investigated the effects of distance and quality through a scenario with \(n=2\) source areas with diametrically opposed positions. To investigate the effects of distance, we positioned the source \(A_1\) at distance \(d_1=1\) m from the depot and we varied the distance \(d_2 \in [0.5,1.5]\) m of source \(A_2\). Both sources have objects with maximum quality \(v_1=v_2=10\). On the contrary, to investigate the effects of quality, we set the source \(A_1\) with objects of quality \(v_1=5\), and we varied the object’s quality \(v_2\in [0,10]\) of source \(A_2\). Both sources were placed at distance \(d_1=d_2=1\) m from the depot. Figure 3(a), (c) show the number of items collected from each source after 20 min by swarms composed of \(S=\{50,100,200\}\) simulated Kilobots for distance and quality experiments, respectively. The closer, or better-quality, source area always has a larger number of collected object. As expected, source \(A_1\) has approximately constant throughput while the throughput of source \(A_2\) decreases with distance \(d_2\) in Fig. 3(a), and increases with quality \(v_2\) in Fig. 3(c). Figure 3(b) shows the allocation of robots among the two paths. Large swarms (e.g. \(S=200\)) have a redundancy of robots and by moving away the source area, more robots are allocated to it. On the other hand, smaller swarms (e.g. \(S=50\)) reduce the robots on that path as it gets further than 1 m in length. Similarly, Fig. 3(d) shows similar dynamics for the quality experiments. Swarms of \(S=50\) robots reallocate robots to the highest quality, whereas larger swarms of \(S=200\) robots saturate the source paths for qualities \(v_2>3\). Still the collection is directly proportional to the object’s quality (Fig. 3(c)) because (especially at the beginning of the experiment) the pheromone trails are more continuous and easier to follow for areas with better quality objects.

Fig. 3.
figure 3

Results from simulation experiments with \(S=\{50,100,200\}\) Kilobots and two source areas with (a)–(b) equal quality, varying distance, and (c)–(d) equal distance, varying quality. In (a), (c) we report the number of collected objects after 20 min; in both cases, the closest, or better quality, source area has a larger number of collected items. In (b), (d) we report the number of robots on each source path after 20 min. Small swarms allocate resources differently than larger swarms. Lines are mean of 100 simulations and the lighter colour fill is the 95% confidence interval. (Color figure online)

4.3 Effects of the Swarm Size S

Figure 4 shows the number of collected items for varying swarm size \(S \in [10,250]\) in scenarios with \(n=2\) or \(n=4\) sources with equal objects’ qualities \(v_i=10\) and equal distances \(d_i=[1]\) m, with \(i \in n\). On one hand, increasing the swarm size S results in an increasing absolute throughput of objects (blue lines on left y-axis). On the other hand, adding more robots increases the swarm density and causes more collisions. This physical interference among robots reduces the individual robot efficiency (green lines on right y-axis). In fact, Kilobots do not have any collision sensor and, in dense environments, they may lose time pushing each other without moving. A similar trade-off between benefits and costs of adding individuals has been already observed in collective behaviour studies [6, 14, 31].

Fig. 4.
figure 4

Effect of the swarm size S on the number of collected objects. Absolute throughput increases with S but physical interference caused by collisions reduces individual efficiency for increasing S. Lines are mean of 100 simulations and the lighter colour fill is the 95% confidence interval for \(v_i=10, d_i=1\) m, \(n \in \{2,4\}, i \in n\). (Color figure online)

4.4 Quality-Distance Trade-Off

As shown in Sect. 4.2, our system favours nearer over further source areas, and better over worse object qualities. Here, we explore how the system compromises between far, better-quality sources versus nearer, lower-quality sources. We investigate two-sources scenarios with \(A_1\) fixed and varying \(A_2\). Figure 5(a)–(b) have \(d_1=1\)m, \(v_1=10\), \(v_2=5\), and varying \(d_2 \in [0.5,1.5]\)m. We can appreciate that the swarm collects more objects from the lower quality area \(A_2\) than from the better quality \(A_1\) only when the difference in distance is \(d_1-d_2 > 0.2\) m. Figure 5(b) shows the number of robots on each path; small swarms (i.e. \(S=50\)) always allocate more resources (robots) to the better quality option, instead larger swarms (i.e. \(S=200\)) have such a redundancy of resources that robots can fill both paths without need to selectively chose the best. In a similar analysis we fixed the best quality \(A_1\) at \(d_1=1.5\) m, \(v_1=10\) and varied the quality \(v_2 \in [1,10]\) of the closest area \(A_2\) in \(d_2 = 0.75\) m. Figure 5(c) shows that large swarms are minimally influenced by quality variation, whereas smaller swarms select the further and best quality when the closest area has objects of poor quality \(v_2 < 4\).

Fig. 5.
figure 5

Trade-off between closer, lower-quality areas and further, better-quality areas. (a)–(b) Scenario with \(v_1=10, d_1=1\) m, \(v_2=5\) and \(d_2 \in [0.5,1.5]\) m. (c) Scenario with \(v_1=10, d_1=1.5\) m, \(d_2=0.75\) m and \(v_2 \in [1,10]\). When robots are overabundant the trade-off is ignored, whereas smaller swarms prioritise higher quality resources. Lines are mean of 100 simulations and the lighter colour fill is the 95% confidence interval. (Color figure online)

4.5 Kilobot Swarm Demonstrations

The real Kilobot demonstrations are run in scenarios almost identical to the one described in Sect. 4.1 except for the environment size which is 2 m \(\times \) 2 m, and the experiment length which is 30 min or longer. We run four demos D1, D2, D3, and D4. Demos D1 and D2 investigate how 50 Kilobots respond to different qualities and distances, respectively. Demos D3 and D4 show how the system scales with increasing number of sources (i.e. \(n=4\)) and robots (\(S=100\)). Figure 6(a) show an image of D3 where in the closeup the ARK screen visualise the camera stream and the virtual environment information. Figure 6(b) shows a screenshot of D4. The video complete videos are available at http://diode.group.shef.ac.uk/FontLlenas2018.html.

Fig. 6.
figure 6

(a) Image from demo D3 with \(n=4\) source areas. In the closeup, the computer screen shows the ARK’s virtual environment, on the background the Kilobots move between virtual sources following virtual pheromone trails (the virtual environment has been superimposed to the image). (b) Screenshot from demo D4. Full videos are available at http://diode.group.shef.ac.uk/FontLlenas2018.html.

Demo D1 shows 50 Kilobots foraging from two sources placed at the same distance \(d_1=d_2= 0.6\) m with different qualities \(v_1=10\) and \(v_2=5\). In contrast, demo D2 shows 50 Kilobots foraging from two sources with equal quality \(v_1=v_2=10\) but placed at different distances \(d_1=0.6\) m and \(d_2=1\) m. In both cases, the swarm response is similar to the one observed in simulation. A noticeable difference consists in lower number of retrieved object and robots on paths in comparison with the results of Fig. 3. This difference is due to the large actuation noise of the Kilobots that was not included in the noise-free simulations. Additionally, the ARGoS Kilobot plugin [40] used for this study was not yet finely tuned on the real speed/friction of the robot and resulted in largely faster robots. Instead, real Kilobots spent considerable time to resolve collisions between robots moving in opposite directions. Larger commuting time should be balanced by more stable pheromone trails which could be achieved by letting the robot autonomously increase the amount of pheromone deposited in a decentralised fashion.

Demos D3 and D4 showcase the system with \(n=4\) sources and up to 100 Kilobots. The four sources have a set of qualities and distances that allows the viewer to appreciate the quality-distance trade-off investigated in this study. The considered qualities are \(v_1=10\), \(v_2=8\), \(v_3=5\), \(v_4=3\) for both demos, while distance are \(d_1=d_3=0.6\) m, \(d_2=0.8\) m, \(d_4=0.5\) m for D3, and \(d_1=d_2=d_3=d_4=1\) m for D4.

5 Discussion and Conclusion

Foraging is a general task that consists of the two main activities of searching the environment to locate objects and of transporting the objects to a central depot area. This task is widely studied in robotics because it entails activities relevant for several robot applications [61]. Employing multi-robot systems to solve the foraging problem has the clear advantage of offering system parallelisation through concurrent object collection by several robots. At the same time, it introduces the challenge of robot coordination. We successfully tackle the robot swarm coordination through stigmergic communication, although we acknowledge that this is not the only solution as several previous studies explored various alternatives, e.g. [2, 13, 20, 42, 44, 48, 51, 61]. In this study, we assume that robots cannot directly communicate or sense each other; they are totally unaware of the other swarm members, still they cooperate with each other through indirect communication. Pheromone trails allowed memoryless robots to create a form of collective memory; robots stored information in the environment that they used to repeatedly find previously discovered sources. The simplicity of the individual robot behaviour allowed a direct transfer of the noise-free simulation code to the Kilobot experiments and minimised the impact of the reality gap [27].

We successfully implemented the system on swarms of 50 and 100 Kilobots supported by the ARK [47] system which allowed robots to operate in a virtual environment. This work exploits the full potential of the ARK infrastructure and showcases ARK’s unique functionalities. Even though the robot experiments included virtual components, the physical implementation of the system has been useful to display the solution robustness and validate the simulation results. While we acknowledge that hybrid experiments (in between reality and simulation) do not correspond to real-world applications, we still believe they represent useful test-beds to validate and demonstrate theories within research labs.

This study included the source quality as a factor influencing the foraging behaviour, this may relate to the priority to fetch each type of object. Further work should better investigate how to control the balance between quality and distance. This investigation could relate to optimal foraging theory [30, 46] which considers the net energy intake as the energy gain discounted by the foraging cost. This type of ‘economical’ analysis of the foraging behaviour allows determination of the best theoretical foraging strategy as a function of various components, such as food-distance, prey-payload, and food-quality. Optimal foraging theory has been applied to predict a large variety of foraging behaviour including the central place foraging [37, 38, 52] investigated here. We acknowledge that previous work has employed optimal foraging theories to engineer multi-robot systems [4, 43, 57] and we believe that this research line should be continued.

Our results are in-line with previous investigations and show that system performance is dependent to the strength of the positive feedback, the swarm size, and discoverability of sources. Robots are in control of only the first factor and it might be useful to identify if the robot could prioritise quality or distance by modulating the positive feedback strength (i.e. pheromone drop size and deposition frequency as a function of source’s quality and discoverability). Additionally, in our study, the robots do not perceive differences in pheromone concentrations, in contrast, ants have a nonlinear response to pheromone that can result in a collective decisions in favor of one food source over another [35]. We hypothesise that the swarm could achieve a similar selective allocation of all resources to the best available source by exploiting a negative feedback in the form of repellent pheromone [7, 49].