1 Introduction

When operating in a hostile, adversarial, or remote environment, communications between agents and command personnel may become weak, noisy, and time-lagged, which makes it difficult to relay instructions to the field. Autonomous drones can be deployed to the field to achieve objectives such as protecting a target or establishing an efficient route between two points. The purpose of this study is to provide a pathway for creating a swarm of autonomous agents capable of finding targets and establishing networks between them using a multi-agent swarm model of the true slime mold Physarum polycephalum. Herein we outline simple algorithms both for generating a gradient field of global information for the swarm and for individual agent kinetics based on that global information. Similar to previous approaches, we utilize a mathematical model inspired by chemotaxis to move agents to and from high-gradient areas. However, unlike previous approaches toward realizing a synthetic slime mold in the form of an autonomous swarm, which are either purely centralized or purely distributed, we split the computation: global information is processed centrally and then distributed to each agent, which performs its own decision-making and kinetics. Given a sufficient localized sensing apparatus, the centralized processing could be dispensed with, and the behavior of the slime mold would be realized by relying solely upon the individual agent kinetics algorithms.

H. Yokoi et al. [1] proposed a mathematical model simulating the behavior of a slime mold using a potential field constructed by the accumulation of various actuators and sensors and propagated using vibrational diffusion. This model, realized as a simulation, was capable of obstacle avoidance and presumed target acquisition; however, the use of waves to propagate information through the potential field creates issues of resonance and interference. By contrast, with Jones’ model [2] we require much less information to construct a gradient field and, because a simple diffusion mechanism is used, the field is not susceptible to issues arising from interactions with boundaries. Yang et al. put forth an autonomous target-searching algorithm in which agents are assigned a discovery cell using a Voronoi diagram. Once an agent has found a target within its cell, a synthetic chemical signal is placed in the environment to attract surrounding agents and enclose the target [3]. As outlined by Trianni et al., one of the main issues facing an autonomous swarm is determining the simple instructions that, when provided to the agents, realize self-organization and intelligence [4]. We believe that Jones’ model of Physarum polycephalum addresses this issue and provides a mechanism of self-organization capable of achieving the behavior of the slime mold. We report a protocol for a real-world drone swarm to discover targets and establish an efficient route between the targets.

This approach is characterized by its combination of an artificial potential field (APF) and an individual agent sensing apparatus. An APF is used to simulate the chemical markers in an environment that organisms such as bacteria, insects, and slime molds react to for locomotion through chemotaxis. The individual agent sensing apparatus defines how an agent reacts to the APF and leads to the emergent behavior of a coherent swarm. The APF can be used to create both areas of repulsion (negative chemical marker value) and areas of attraction (high chemical marker value), while also allowing agents to “communicate” with each other. This is not communication through direct signals; rather, when an agent moves, it leaves behind a chemical marker that other agents react to. Although the use of an APF is beneficial for emergent swarm behavior, it also poses the risk of generating local maxima such as the death spiral seen in ants, as discussed later.

Fig. 1 Demonstration of the path-finding and network-establishing behavior of Physarum polycephalum. Nakagaki et al. placed food sources at each metro station on a miniature map of Tokyo. After the mold was allowed to grow for 26 hours, a full network had developed, efficiently connecting each metro station [5]

The slime mold is a simple organism whose behavior can be utilized for many applications such as maze solving [6], the formation of logic gates [7], and network development [8]. An example of this behavior can be seen in Fig. 1, where Nakagaki et al. positioned food at each of Tokyo’s metro stations on a miniature version of the city. Next, they placed a sample of Physarum polycephalum in the dish and allowed it to grow [5]. The network developed by the mold was quite similar to Japan’s existing railways, which were designed through rigorous mathematical optimization. We demonstrate the use of a Physarum polycephalum inspired multi-agent swarm protocol for locating targets, aggregating around targets, and establishing energy-efficient networks between targets. From the perspective of a multi-agent swarm, it is important to consider the processing power required by individual agents. Specifically, this processing must be minimized to reduce payload and latency within the swarm. To increase the swarm’s resistance to attacks, agent-to-agent communication should be limited and each agent should be able to react to its own stimuli. Therefore, a protocol based on simple stimuli is promising for a multi-agent swarm. We outline the important factors of Jones’ slime mold model, together with the infrastructure and algorithms needed to use this model within a multi-agent swarm of ground agents [2][9][10]. We also explore the effectiveness of a slime mold inspired swarm and discuss prospective future work toward an effective and deployable swarm exhibiting the behavior of the slime mold Physarum polycephalum.

2 Mold Swarm model

Fig. 2 Evolution of an 8000-agent holonomic “Mold Swarm”. Fifteen food sources (red) are randomly placed in the arena. The six snapshots are taken at iteration counts (left to right): 0, 50, 100, 500, 1000, 1250

Early research into the behavior of Physarum polycephalum utilized oscillators to emulate the generation of and interactions between plasmodia [9]. Recent models, however, attempt to capture the overall behavior of the single-celled organism and leverage the distributed intelligence of the mold [8][9]. Highlighting this recent work, Dr. Jeff Jones from the Centre of Unconventional Computing has outlined a model of the behavior of Physarum polycephalum that is the basis of this “Mold Swarm” implementation [2]. In an effort to exploit the mold’s behavior for target discovery and network development, Jones’ model utilizes a multi-agent swarm approach in which each agent is representative of a plasmodium of the overall slime mold. The plasmodium, or individual agent, reacts to a chemoattractant field (APF) and moves via a unicycle kinetics model. When there is a sufficient density of plasmodia, the swarm exhibits self-organized formation and collective transport networks. Important features of the model include an APF and individual plasmodium sensing [7]. It is important to note that this model, like many others, is holonomic, meaning that the plasmodia are capable of instantaneously rotating toward any direction. Unlike other models that utilize an APF, Jones’ model allows for a two-way interaction: APF-to-agent and agent-to-APF. Specifically, the agents respond to the APF and deposit chemoattractant when they move and when they discover food sources. This two-way interaction is summarized in the following steps:

  1) An artificial potential field is generated with preassigned food sources represented by a high concentration of chemoattractant that is diffused through the field.

  2) Individual agent sensing is employed for agents to move towards higher concentrations of chemoattractant within the APF.

  3) Upon movement of an agent, some chemoattractant is placed in the APF at its location.

  4) On the discovery of a food source, agents either consume or place a higher concentration of chemoattractant to further attract other agents.

  5) The APF is updated and sent out to the agents once more in a new state reflecting the evolution of the environment.

  6) Return to step 2.

To illustrate the two-way interaction, we recreated this model; the evolution of an 8000-agent swarm with 15 food sources is shown in Fig. 2. For a full description and analysis of Jones’ model, consult “Mechanisms Inducing Parallel Computation in a model of Physarum polycephalum Transport Networks” [11].

2.1 Artificial potential field

The slime mold Physarum polycephalum utilizes chemotaxis, which is the movement of an organism in response to chemical stimuli [12]. Many organisms, such as ants and bacteria, rely on this function to find food or avoid toxins within the environment. The model used in this study focuses only on positive chemotaxis, which is the movement of agents toward a higher concentration of the chemical in question; for our purposes, the chemical marks food sources. Although negative chemotaxis is beneficial for object avoidance problems such as maze solving, we focus on the target-finding nature of the slime mold. The APF (see Fig. 3) is constructed by increasing the chemoattractant value at the location of each food source and each plasmodium (agent). The chemoattractant values of the food sources (fVal) and plasmodia (pVal) are both parameters that can be adjusted to tune the behavior of the swarm. After the APF is initialized, the values are diffused. Diffusion is governed by the parameter diff; after each step, the field is diffused to simulate the random diffusion of chemicals within the environment.
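The construction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this study: the function names (`deposit`, `diffuse`, `build_apf`) are hypothetical, and we assume that `diff` is the fraction of each cell replaced by the mean of its 3x3 neighbourhood, which is only one plausible reading of the diffusion parameter.

```python
import numpy as np

def deposit(field, cells, value):
    """Add `value` of chemoattractant at each (row, col) cell, in place."""
    for r, c in cells:
        field[r, c] += value
    return field

def diffuse(field, diff):
    """One diffusion step: blend each cell with the mean of its 3x3 neighbourhood.

    `diff` in [0, 1] is the fraction of a cell replaced by the local mean
    (an assumed interpretation of the `diff` parameter).
    """
    padded = np.pad(field, 1, mode="edge")  # replicate border cells
    rows, cols = field.shape
    local_sum = np.zeros_like(field)
    for dr in range(3):
        for dc in range(3):
            local_sum += padded[dr:dr + rows, dc:dc + cols]
    return (1.0 - diff) * field + diff * (local_sum / 9.0)

def build_apf(shape, food_cells, agent_cells, f_val, p_val, diff):
    """Place fVal at food sources and pVal at agents, then diffuse once."""
    field = np.zeros(shape, dtype=float)
    deposit(field, food_cells, f_val)
    deposit(field, agent_cells, p_val)
    return diffuse(field, diff)

# One food source in the middle of a 5x5 arena and one agent in a corner.
apf = build_apf((5, 5), [(2, 2)], [(0, 0)], f_val=10.0, p_val=1.0, diff=0.5)
```

With `diff = 0` the field is left unchanged, and larger values spread each peak further per step, which is the knob the text refers to when tuning swarm behavior.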

Fig. 3 Surface plot with heat map of the chemoattractant field (APF). Each peak represents a food source and the smaller hills represent clusters of plasmodia

Fig. 4 Depiction of the sensing apparatus for each plasmodium (agent) used in both the holonomic simulation and non-holonomic protocol. Virtual plasmodium centered on the green square with chemical sensors at FL, FR, F offset by SO and spaced by the angle SA [9]

An important factor in the evolution of the chemoattractant field is the behavior of the plasmodia, described earlier as the two-way interaction between agents and the APF. After a plasmodium has orientated itself toward the highest concentration of the chemical, it moves forward and deposits pVal within the lattice. This causes plasmodia to be attracted to each other and subsequently to form large groups centered around the food sources, which deposit a much larger fVal after each step in the simulation.

2.2 Sensing

To navigate the chemoattractant field, each plasmodium has three sensors offset ahead of it. After movement, each agent accesses the chemoattractant field and measures the value at the FL, F, and FR locations to determine the highest reading. The agent then rotates toward this direction or continues facing forward. Fig. 4 and Table 1 show the synthetic sensors and the parameters that dictate how the plasmodium behaves. By changing the sensor offset (SO) and sensor angle (SA), the behavior of the agents can be adjusted. Specifically, it is observed that a larger SO value increases the foraging behavior of the agents, while a smaller SO increases the grouping behavior and decreases the foraging behavior. The SA can be adjusted, but a value of \(90^\circ\) is most effective, as discussed later. With smaller SA values, agents tend to move in straight paths and often fail to detect a food source or a group.
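The sensing step can be illustrated with a short Python sketch. It is a simplified reading of the description above, not the model's reference implementation: the names `sample` and `sense` are hypothetical, positions and SO are expressed in grid cells with x mapped to columns and y to rows, the agent turns by exactly SA, and a tie between FL and FR falls to FR. All of these are assumptions made for illustration.

```python
import math
import numpy as np

def sample(field, x, y):
    """Read the field at the cell containing (x, y); off-arena reads as -inf."""
    col, row = int(math.floor(x)), int(math.floor(y))
    if 0 <= row < field.shape[0] and 0 <= col < field.shape[1]:
        return field[row, col]
    return -math.inf

def sense(field, x, y, heading, so, sa):
    """Return the heading change (radians) chosen from the FL, F, FR readings.

    so: sensor offset in cells; sa: sensor angle in radians.
    A positive result is a counter-clockwise turn in the (x, y) frame used
    here; 0.0 means the agent keeps facing forward.
    """
    readings = {}
    for name, offset in (("FL", sa), ("F", 0.0), ("FR", -sa)):
        angle = heading + offset
        readings[name] = sample(field,
                                x + so * math.cos(angle),
                                y + so * math.sin(angle))
    # Keep the current heading when the forward sensor is at least as strong.
    if readings["F"] >= readings["FL"] and readings["F"] >= readings["FR"]:
        return 0.0
    return sa if readings["FL"] > readings["FR"] else -sa

# An agent in the middle of an 11x11 field with a marker on its FL sensor.
field = np.zeros((11, 11))
field[8, 5] = 1.0
turn = sense(field, 5.5, 5.5, heading=0.0, so=3.0, sa=math.pi / 2)
```

Because the three sensor positions scale with `so`, a larger offset lets the agent respond to concentrations further away, which is consistent with the stronger foraging behavior reported for larger SO values.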

Table 1 Parameters for sensing

3 Swarm infrastructure

Fig. 5 Anki Vector version 1.0, the successor to Anki’s Cozmo line of robots. The Anki Vector is used as the ground-based agent for the non-holonomic protocol based on Jones’ Physarum polycephalum model. We leverage the Vector’s distance sensor and differential drive

To translate a swarm based on a holonomic mathematical model into a real-world swarm of non-holonomic robots, three mechanisms must be present: the robotic agents of the swarm; a communication protocol, whether command-to-agent or agent-to-agent; and a sensing apparatus to update the knowledge of an ever-changing arena. The choice of agent is extremely important: it must be compatible with the assumptions of the underlying model while remaining cost-effective. The underlying assumption that the agents must adhere to is unicycle kinetics, meaning that the agent has two mechanisms of movement: z-rotation (top-down) and x-translation (forward-facing). Next, there must be a method of communication by which agents can continuously receive the APF used for sensing and determining locomotion. Lastly, there must be a sensing apparatus capable of acquiring the pose data of each agent in the swarm. Careful consideration of these three mechanisms is important in reducing the error associated with translating Jones’ model from a holonomic simulation space to a non-holonomic real-world or simulation space.

Fig. 6 (LEFT): Overhead camera running April Tag detection software to output the pose of each Vector. (RIGHT): Diagonal view of the test arena, including charging stations and Anki Vectors

3.1 Agents

To emulate the effective target-finding behavior of Jones’ model, it was determined that ground-based agents were the most readily compatible with the model. Although this work could be expanded to three dimensions, we focus on the problem of target finding within a 2D space. As mentioned earlier, the agents must employ unicycle kinetics in accordance with the assumptions of the model. Thus, we employ Anki Vectors, shown in Fig. 5, as they are easily controlled through WiFi communication, utilize unicycle kinetics, and have multiple sensors. The sensors featured on the Vector include, but are not limited to, four cliff sensors, a forward-facing camera, and a forward-facing proximity sensor. The proximity sensor removes the requirement that each Vector know the location of every other agent for collision avoidance. To program the Vectors, Anki has created an SDK that can be utilized over WiFi to communicate with each Vector. Although the Anki Vectors have many useful sensors, their programmed AI can assume control of the robots if they lose connection, fall on their back, or get stuck in a loop, which may disrupt swarm behavior.

Table 2 ROS2 node list

3.2 Communication

In the context of a drone swarm, ground truth must be established so that agents can react accurately to the evolving environment. For a distributed swarm, a consensus mechanism is important for keeping the actions of the individual agents coherent so as not to disrupt the desired emergent behavior. In this implementation, we are more concerned with validating Jones’ model for use in real-world non-holonomic agents. Thus, to avoid consensus issues, a central hub of communication is employed for distributing ground truth data in the form of an APF. Since the agents have WiFi capabilities, ROS2 (Robot Operating System version 2) is used as the communications protocol. ROS2 is specifically designed with many robots and sensors in mind. It is language-agnostic, so data published on the network can be read by all agents. It is also general, meaning that each device that creates a node on the network can be replaced with a new device carrying the same node. The ROS2 network is composed of nodes that can publish data to topics or subscribe to multiple topics to read their data. Topics represent specific data within the network, whether it is an agent’s location or the proximity sensor reading from an agent. The main nodes and the topics that they subscribe or publish to are listed in Table 2.

3.3 Location data

To obtain a ground truth for updating the APF, an overhead camera is used. The overhead camera determines the pose data of each agent through the use of April Tag recognition software created by April Robotics. Specifically, each agent has an April Tag placed on top of it, orientated parallel to the agent’s x-axis (forward-facing axis), as seen in Fig. 6. To establish the ground truth, the pose data of each agent is collected and normalized to the origin (represented as another April Tag) to maintain cohesion in the event of the camera or arena being perturbed. Note: having an April Tag as the origin allows for the use of multiple overhead cameras, with an origin for each, to stitch together a larger arena.
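The normalization to the origin tag amounts to a planar rigid-body transform. The following Python sketch shows one way to express it, assuming each detection has already been reduced to a 2D pose (x, y, theta) in the camera frame; the function name `normalize_pose` and this pose representation are illustrative assumptions rather than part of the April Tag software.

```python
import math

def normalize_pose(agent, origin):
    """Express an agent pose (x, y, theta) in the frame of the origin tag.

    Both poses are given in the camera frame; theta is in radians.
    """
    ax, ay, atheta = agent
    ox, oy, otheta = origin
    dx, dy = ax - ox, ay - oy
    cos_o, sin_o = math.cos(otheta), math.sin(otheta)
    # Rotate the offset by -otheta so the origin tag's axes become the arena axes.
    x = cos_o * dx + sin_o * dy
    y = -sin_o * dx + cos_o * dy
    # Wrap the relative heading into [-pi, pi).
    theta = (atheta - otheta + math.pi) % (2.0 * math.pi) - math.pi
    return x, y, theta

# The origin tag is rotated 90 degrees in the camera image; the agent sits
# two units along the tag's own x-axis and faces the same way as the tag.
pose = normalize_pose((1.0, 3.0, math.pi / 2), (1.0, 1.0, math.pi / 2))
```

Because only the offset between the two tags enters the result, shifting the camera so that both detections move together leaves the normalized pose unchanged, which is the robustness property the text relies on.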

4 Mold swarm algorithms

The swarm protocol developed in this study focuses on two main programs: Mold Overlord and Agent Controller. These two programs encompass the desired behavior established by Jones’ model of the target-finding and network development abilities of the slime mold Physarum polycephalum. The command infrastructure of the swarm is split into two portions: the communication hub and the individual agents. Specifically, the communication hub is responsible for gathering the ground truth data, compiling an updated APF, and transmitting the updated APF to every agent through the ROS2 network. To achieve a decentralized swarm, each agent has a dedicated computing module for its sensing and its instance of Agent Controller. A benefit of providing each agent with a dedicated computing module is the increased resilience of the swarm: if a few modules fail, the majority of the swarm will still be operational. Note: References to the Anki Vector SDK and April Tag recognition program can be found on the Anki SDK and April Robotics GitHub respectively.

4.1 Mold overlord

Mold Overlord was designed such that it would do the minimal computing required for the swarm to function. The end goal of many swarm models is full distributed autonomy; however, there is not currently a method of attaining a ground truth consensus in accordance with Jones’ model. Thus, the Mold Overlord is responsible for establishing the ground truth generated by the overhead camera and producing the APF that the individual agents react to. A publisher is created to publish the APF to the ROS2 network in the form of a matrix using ROS2’s image message type.

figure a

The APF matrix is converted to an image message using CvBridge and its cv2_to_imgmsg method. Next, a subscription is made to the “Agent List” topic, which is an array containing the ID of every agent on the network. After the agent list has been received by the Mold Overlord node (Algorithm 1), the program creates a subscription for every agent and stores these in an array to be accessed later (the ID of the agent is also the index of its subscription in the array). To update the locations of the agents, spin_once is called for every agent, so that the pose of each is updated prior to updating the APF. Since there are multiple subscriptions on the agentNode, spin_once cycles through every agent in the subscription array before returning to the original. When spin_once is called, the position and orientation of the current agent in the loop are pulled from the network in the form of a Twist message and saved in a pose data matrix.

figure b
figure c
figure d

After the agent pose data has been updated, the food sources and agents are placed within the APF using Algorithm 2. Specifically, wherever an agent is found, pVal is stored in that location within the APF. The same procedure is conducted for the food sources (fVal). After the agents and food sources have been added to the APF, Algorithm 3 is called to simulate natural chemical diffusion in the environment. Lastly, the new field is published to the network.

4.2 Agent controller

Agent Controller (Algorithm 4) was designed such that each agent emulates the behavior of the plasmodium from Jones’ model, meaning that the controller utilizes the same sensing and kinetics as Jones’ model to mimic the behavior of a holonomic system on non-holonomic agents. This algorithm predominantly dictates the behavior of the swarm. To clarify, every agent in the swarm has a dedicated instance of Agent Controller. For a given agent in the swarm, the controller first creates subscriptions to the APF, the global tracker, and the agent’s own proximity sensor. Next, it initializes a publisher to send command velocities back to its agent’s motors.

figure e

After all preliminary tasks are completed, the agent controller pulls its agent’s location, proximity sensor reading, and the newest APF from the ROS2 network. Then the controller begins its sensing protocol as outlined in Jones’ model (Algorithm 5). Before the Agent Controller sends the command velocities to its agent, a collision avoidance check is completed (Algorithm 6). Agent Controller receives the proximity sensor reading of its agent and compares it to a preset safety distance. If a collision is impending, a velocity command is sent to orientate the agent in a new random direction. Otherwise, the command velocities are sent to its agent based on the sensing algorithm.
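The decision made at the end of each control step can be sketched as follows. This is a hedged illustration of the logic described above, not a reproduction of Algorithm 6: the function name `command_velocity`, the default `linear_speed` and `angular_gain` values, and the choice to stop while re-orientating are all assumptions.

```python
import math
import random

def command_velocity(turn, proximity, safety_distance,
                     linear_speed=0.05, angular_gain=1.0, rng=random):
    """Return (linear, angular) velocity for one control step.

    turn: heading change requested by the sensing step (radians).
    proximity: forward distance reading from the agent (metres).
    safety_distance: preset threshold below which a collision is impending.
    """
    if proximity < safety_distance:
        # Impending collision: stop and rotate toward a new random direction.
        return 0.0, rng.uniform(-math.pi, math.pi)
    # Otherwise keep moving forward while steering toward the sensed direction.
    return linear_speed, angular_gain * turn

# Clear path: the agent drives forward and steers by the sensed turn.
lin, ang = command_velocity(0.5, proximity=1.0, safety_distance=0.2)
```

Keeping the forward speed constant whenever the path is clear means each agent traces arcs rather than turning in place, in line with the unicycle kinetics assumed for the agents.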

figure f

5 Simulations and observations

Fig. 7 Snapshot of CoppeliaSim with 128 agents placed in the arena

Table 3 Parameters for simulation

Fig. 8 Alternative view of the APF during a trial with 4 food sources and 32 agents. Each agent resembles a comet with a tail of its deposited chemoattractant being diffused into the environment

Fig. 9 Graphs showing the grouping (Left) and target-finding (Right) behavior of the simulation and protocol

Fig. 10 Comparison of angular (Left), linear (Middle), and normal velocities (Right) over time between the holonomic and non-holonomic swarms. To measure the entropy of the system, the individual velocities of each agent are recorded at each iteration and then averaged over the swarm

To assess the viability of the proposed swarming algorithms, a model of the agents (Anki Vectors) was created in a robotic agent simulation environment, CoppeliaSim (Fig. 7). Swarm populations of 4, 8, 16, 32, and 48 agents were tested. Note: Due to complications with CoppeliaSim, populations greater than 50 agents could not be run, as a number of agents would become unresponsive. The parameters of the swarm trials of 4, 8, and 16 agents can be seen in Table 3. In addition to varying the population, the sensor offset (SO) and sensor angle (SA) were varied for the 32 and 48 agent populations. Specifically, three sets of parameters were tested with each population:

  1. SO = 0.15 m, SA = \(45^\circ\)

  2. SO = 0.15 m, SA = \(90^\circ\)

  3. SO = 0.10 m, SA = \(90^\circ\)

The APF used in the trials featured four food sources placed in the arena, each equidistant from the center and walls, forming a cross, as shown by the large white circles in Fig. 8. To determine the performance of the non-holonomic swarm algorithms, five metrics were compared between the holonomic simulation of 8000 agents and the non-holonomic trials:

  1. Average nearest-neighbor distance over time: Fig. 9 Left.

  2. Average distance to nearest target (food source) over time: Fig. 9 Right.

  3. Entropy of States: Average angular velocity over time: Fig. 10 Left.

  4. Entropy of States: Average linear velocity over time: Fig. 10 Middle.

  5. Entropy of States: Average speed over time: Fig. 10 Right.

These five metrics track the performance of the swarm for the following reasons. The average nearest-neighbor distance signifies the grouping of the swarm over time: an undeveloped swarm of Physarum polycephalum is distributed, while a developed swarm of agents is closely packed around each other and the targets. The average nearest-target distance signifies the target-finding capability of the swarm. As observed in the holonomic simulations of 8000 agents, as the swarm evolves, the velocities of the agents begin to converge. Specifically, the angular velocities increase (as agents get closer to targets, they increase rotation to circle the targets), the linear velocities decrease (at the beginning agents are far from targets; as they get closer, they no longer need to move strictly linearly), and the overall speed decreases as the swarm approaches a steady state.
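The two distance metrics can be computed directly from agent and target positions at a given iteration. The sketch below is a minimal NumPy illustration; the function names are hypothetical and the positions are assumed to be 2D arena coordinates.

```python
import numpy as np

def mean_nearest_neighbor_distance(positions):
    """Average, over all agents, of the distance to the closest other agent."""
    positions = np.asarray(positions, dtype=float)
    diffs = positions[:, None, :] - positions[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(dists, np.inf)  # an agent is not its own neighbor
    return dists.min(axis=1).mean()

def mean_nearest_target_distance(positions, targets):
    """Average, over all agents, of the distance to the closest target."""
    positions = np.asarray(positions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    diffs = positions[:, None, :] - targets[None, :, :]
    return np.linalg.norm(diffs, axis=-1).min(axis=1).mean()

agents = [(0.0, 0.0), (3.0, 4.0), (3.0, 5.0)]
food = [(0.0, 0.0), (3.0, 4.0)]
grouping = mean_nearest_neighbor_distance(agents)      # (5 + 1 + 1) / 3
targeting = mean_nearest_target_distance(agents, food)  # (0 + 0 + 1) / 3
```

Evaluating both functions at every iteration yields the time series of the kind plotted in Fig. 9; both values fall as agents cluster and converge on the food sources.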

At four, eight, sixteen, and thirty-two agents, little emergent behavior is observed beyond attraction to the food sources and one important behavior that is also seen in ant colonies: the death spiral. The death spiral is observed when a group of ants is separated from the main foraging party and begins following each other’s pheromones, forming a continuous spiral [13]. In the swarm, this behavior can be exhibited by one or several agents that are not close enough to a food source. An example can be seen in Fig. 8 in the groups in the middle and in the upper left. This is a typical observation in models that utilize an APF and chemotaxis.

At a population of forty-eight agents, movement between the target locations begins to be seen. This behavior is reflected in Fig. 9 (right) by the shallow slope of the N = 48 curve. Although most of the agents are bounded in the region between food sources, there are still groups off to the sides in death spirals. This indicates that the density of agents required to see the full behavior of a true Mold Swarm has not been reached.

After exploring the parameter space of the swarm, it was determined that a sensor offset of 0.15 m and a sensor angle of \(90^\circ\) performed the best. Since the problem being solved is autonomous target finding in a 2D space, the parameters that yield the best target-finding ability are considered the best. As seen in Fig. 9, a sensor offset of 0.15 m and a sensor angle of \(90^\circ\) outperformed the other parameter sets for both populations (\(N = 32\), \(N = 48\)). Therefore, the Entropy of States analysis (Fig. 10) focuses solely on the performance of swarms with these parameters.

The entropy of states (Fig. 10) is further used to compare the behavior of the non-holonomic and holonomic swarms. To measure the entropy of the system, the individual velocities of each agent are recorded for each iteration and then averaged over the entire swarm. As observed in the case of the holonomic swarm, the average velocities approach an asymptote representative of steady-state behavior.
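The averaging step described here can be written compactly. The following is a minimal sketch under stated assumptions: the function name `swarm_averages` is hypothetical, velocities are assumed to be logged as arrays of shape (iterations, agents), and magnitudes are averaged so that left and right turns do not cancel, which the text does not specify.

```python
import numpy as np

def swarm_averages(linear, angular):
    """Average per-agent velocity magnitudes over the swarm at each iteration.

    linear, angular: array-likes of shape (iterations, agents).
    Returns two 1-D arrays of length `iterations`.
    """
    linear = np.abs(np.asarray(linear, dtype=float))
    angular = np.abs(np.asarray(angular, dtype=float))
    return linear.mean(axis=1), angular.mean(axis=1)

# Two iterations of a two-agent swarm.
lin_avg, ang_avg = swarm_averages([[0.1, 0.3], [0.2, 0.2]],
                                  [[1.0, -1.0], [0.0, 0.5]])
```

Plotting the returned arrays against the iteration index gives curves of the kind compared in Fig. 10, where a flattening curve indicates that the swarm is approaching a steady state.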

With regard to the holonomic simulation, each of the graphs in Fig. 10 approaches a steady state. This suggests that within 1500 iterations, the swarm is fully established and the velocities are steady. The angular velocity first decreases as each agent is sufficiently separated, so agents move in a straight path towards their nearest neighbor or target. As soon as agents begin to form groups, average rotation velocity increases representing collision avoidance and movement between food sources. The linear velocity shows that as the swarm becomes more evolved there is less movement away from the swarm itself, and the majority of linear movement is to transition between food sources.

The behavior of the non-holonomic model is more complex due to its differing kinetics. While each agent in the holonomic simulation senses and then rotates toward its next movement, each agent in the non-holonomic simulation is constantly moving except when avoiding an impending collision. Thus, each agent traces an arc while moving rather than a series of straight lines. This constant linear movement is reflected in Fig. 10 (Middle), where the average hovers at a pseudo-steady state. As seen in Fig. 10, unlike the holonomic simulation, the non-holonomic swarm is already at a steady state with respect to velocities. This is what would be expected of a real-world swarm; specifically, agents will always be moving. As a result of the non-holonomic nature of this swarm, agents must move linearly in order to turn toward the target location. Agents are unable to turn in place on top of a food source, as is observed in the holonomic simulation. Overall, these three graphs depict the difference in dynamics between the holonomic simulation and the non-holonomic protocol.

The graphs in Fig. 9 show the grouping (nearest neighbor) and target-finding (nearest target) behavior of both the simulation and the protocol. As seen in Fig. 9 (Left), in both implementations the average distance between agents decreases over time and approaches a steady state. This shows that both the simulation and the protocol are capable of forming a swarm. Returning to population density, the oscillations in the data for the 32 agents are believed to be the result of agents both getting trapped in death spirals and “losing scent” of the food source. If an agent loses track of the food source, it would likely move in a straight path until reaching a wall, at which point it turns around and finds another food source or group. In addition, as the population of the swarm increases, the steady-state nearest-neighbor distance decreases, suggesting further emergent swarm behavior. Regarding the nearest-target graph, it is expected that the distance to the nearest target will decrease in both simulations until a steady state is reached; however, this is only displayed by one of the parameter pairs. This reflects the influence that the individual agent parameters have on the emergent swarm behavior. In addition, the pVal and diff parameters may be tuned in favor of either target-finding or grouping behavior alone. With the right parameters, as previously discussed, the desired target finding is exhibited and swarming behavior is seen.

6 Conclusion

We demonstrated the potential of a slime mold inspired multi-agent swarm for use in target finding and congregation. The model used is capable of finding food sources and establishing an efficient route between multiple sources. However, the protocol has not been tested fully on a swarm of non-simulated agents, holonomic or not. In addition, the parameter space of the APF could be explored further to improve the translation from a holonomic model to a set of real-world non-holonomic agents.

Future work should focus on expanding the swarm protocol to three degrees of freedom and testing it on a swarm of quad-copters, as this would greatly expand its possible areas of deployment. In addition, to achieve a more distributed swarm, alternatives to a central processing hub for the APF should be explored, including light communication and machine vision [14]. The global tracker could be replaced with light tracking, where agents and targets emit light of different wavelengths that can be used for precise three-dimensional locating [15]. Machine vision-based sensing could also be incorporated by using the Anki Vector’s camera to determine the density of groupings. Physarum polycephalum remains one of the most promising biological examples of swarming for network development, target finding, and path optimization.