1 Introduction

Robotics is now a convenient tool for studying animal behaviour. In recent ethological and animal behaviour studies, robots are used to deliver specific, controlled stimuli and to assess the response of the animals under scrutiny. This makes it possible to test various hypotheses about the nature of the signals used by animals for social interactions [16, 21].

Autonomous robots interacting in real time with animals [19] make it possible to create social interactions between the two. This has already been demonstrated by several authors studying the behaviours of sheepdogs [26], cows [12] or drosophila [27], to cite a few. In this paper, we focus on zebrafish (Danio rerio): we describe a biomimetic model that can be implemented in a robotic lure, and we validate its acceptance by four zebrafish in a structured environment.

The main difficulty is to make the robotic lure behave in such a way that it is accepted by the animals as a social companion, just as any other interacting fish would be. Beyond the scope of this paper, this is a first step towards modulating (through action) the collective behaviours of the observed zebrafish [15].

Different approaches have been proposed to control the movement of fish-lures [9]. Most of them do not involve a closed loop of social interaction with the fish. This is often the case for lures fixed to a robotic arm that performs repeated movements, but also for studies with autonomous fish-lures. Closing the loop of social interactions requires real-time tracking, or perception, of the agents (fish and robot), and a decision-making algorithm to control the robot behaviours. In most of the experiments reported in the literature, the robots driven with closed-loop control are programmed to follow the centroid of the fish group, to ensure that the robot will join and follow the group of fish. However, this type of controller implies that the robot is more a passive follower than a real group member making its own decisions. The embodiment of bio-inspired models can lead to a better social integration of the artificial agents in animal groups. It can also allow the robots to influence the collective decision of the mixed group, because specific preferences can be given to the robot by tuning the parameter values of the model [9, 15].

We present a method to automatically calibrate a new behavioural zebrafish model by evolutionary parameter optimisation. This multilevel model describes collective behaviour in a structured environment in agreement with experimental observations. It substantially extends our previous model for collective behaviour in a homogeneous environment [10]. The model takes into account a simple structured environment composed of two rooms, and the fact that the fish adapt their behaviour to the zone they occupy while performing collective behaviour. Calibrating such a multilevel, spatially dependent social behaviour model is difficult because it involves trade-offs between social tendencies (aggregation, group formation) and responses to the environment (wall-following, zone occupation). We use an evolutionary algorithm (NSGA-II [13]) to optimise the parameters of this model so that the exhibited collective dynamics correspond to those observed in biological experiments. We then validate this model experimentally by implementing it as the controller of robots that are integrated in small fish groups.

2 Materials and Methods

2.1 Experimental Set-Up

We use the experimental set-up described in [3, 9, 11, 24], with the arena presented in [9, 24]. This set-up (Fig. 1A) consists of a white plexiglass arena (Fig. 1C) of \(1000\times 1000\times 100\) mm, that is composed of two rooms linked by a corridor. To validate experimentally our calibrated model, we use a robot developed by the EPFL [2,3,4,5] for the ASSISI project [23]. This robot is powered by two conductive plates under the aquarium. An overhead camera captures frames that are then processed for tracking and control purposes (see Fig. 1A).

All trials have a duration of 15 min. We tracked the positions of the agents by using the idTracker software [22]. Using this software, we obtain the positions \(P(x, y, t)\) of all agents at each time step \(\varDelta t = 1/15\) s for all experiments, and build the trajectories of each agent. The experiments performed in this study were conducted under the authorisation of the Buffon Ethical Committee (registered to the French National Ethical Committee for Animal Experiments #40) after submission to the French state ethical board for animal experiments.

Fig. 1.

Panel A: Experimental set-up used during the experiments [2,3,4,5, 9]. Panel B: FishBot [2, 4, 5]: the robot used for mimicking fish motion patterns, with the biomimetic lure used during the reference experiments. This robot was developed by the EPFL for the ASSISI project [23]. Panel C: Experimental arena composed of a tank containing two square rooms (\(350\times 350\) mm at floor level) connected by a corridor (\(380\times 100\) mm at floor level). The fish tend to swim from one room to the other, either in small groups, or individually. This set-up is used to study the zebrafish collective dynamics. Panel D: Positions of the three different zones corresponding to different types of behaviours: in the corridor (zone 1), in the centre of each room (zone 2), and near the walls of each room (zone 3).

2.2 Behavioural Model

Most fish collective behaviour models do not take the environment into account (i.e. the walls or the structure of the tank) because they focus only on social interactions [18, 25].

However, zebrafish show context-dependent behaviours when they are in a structured environment: they adapt their individual behavioural pattern to their spatial position. Moreover, because they are a gregarious species, they also take into account the positions and behaviours of the other fish, and can aggregate or start collective behaviours. Like many animal species, zebrafish display strong thigmotactism and follow walls or edges. We show that they adapt their behaviour in three different zones of the structured set-up: first, the zone close to the walls; second, the zone in the centre of the rooms; and third, the corridor they use to change rooms. We take these spatial and context-dependent behaviours into account.

Each zone corresponds to a behavioural attractor. When the individuals are in one of the three zones, they adapt their behaviour and perform specific behavioural patterns. In the zone near the walls they mainly perform thigmotactism (wall-following), in the centre of the room they explore, and in the corridor they transit from one room to the other. At the same time, they also take into account the behaviour of the other fish, since they also engage in collective behaviours such as collective departures from the rooms. The other fish can be in any of the other zones and can thus also induce behavioural attractor switching in their companions.

We extend the biomimetic hybrid model [9, 10] using microscopic and macroscopic information [7, 8]. This new model (described in Fig. 2) takes into account zones that correspond to different behavioural attractors and thus allows context-dependent behaviours. The individual can switch from one behavioural attractor to the other and at the same time perform collective behaviour. Our model describes individual choices close to action selection and collective behaviours at the same time. It is a step towards modelling action selection in the context of collective behaviours.

Fig. 2.

Multilevel model used to describe fish behaviour. The agents display different behavioural attractors depending on the zone where they are situated. Thus, according to the spatial position of an agent, the physical features of the zone drive it towards a specific behavioural attractor. A behavioural attractor corresponds to a set of behavioural patterns adapted to the zone where the agent is located. It can correspond to different parameter sets for the same kind of behaviour.

Fig. 3.

Panel A: Computation of the PDFs used by the model. One function corresponds to the focal fish; another corresponds to the perceived neighbouring agents. The final PDF is a weighted sum of these functions, with a normalisation factor \(\gamma _{z_1,z_2}\) corresponding to the affinity between the zones \(z_1\) (origin) and \(z_2\) (destination). The direction taken by an agent is drawn randomly from the resulting PDF by inverse transform sampling. Panel B: Table of model parameters for each agent. The zone \(z_i\) corresponds to the zone where the agent is situated at time t, and \(z_j\) to the zone where the agent would be at time \(t+1\). The linear speed distributions of the agents are the same as the ones observed in the Control experiments, and they are not optimised. The other parameters in the table are optimised.

We present a multi-level and multi-agent biomimetic model, inspired by [9, 10], that describes the individual and collective behaviours of fish. As in [10], this model makes the link between fish visual perception (of congeners and walls) and motor response (i.e. the trajectories of the agents). However, it is also capable of expressing variability in agent behaviour when the agents occupy specific zones of the arena (behavioural attractors). Figure 3B lists the model parameters.

In this model, the agents update their position vector \({X_i}\) with a velocity vector \({V_i}\):

$$\begin{aligned} X_i(t+\delta t)= X_i(t) + V_i(t)\delta t \end{aligned}$$
(1)
$$\begin{aligned} V_i(t+\delta t)= v_i (t+\delta t) \varTheta _i (t+\delta t) \end{aligned}$$
(2)
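As a minimal sketch of Eqs. 1 and 2, the update can be written as follows. It assumes that the direction \(\varTheta _i\) is stored as an angle in radians and converted to a unit vector; the function names and numerical values are illustrative and are not taken from the original implementation.

```python
import math

def velocity(speed, heading):
    """Eq. 2: velocity vector from linear speed v_i and direction Theta_i (radians)."""
    return (speed * math.cos(heading), speed * math.sin(heading))

def update_position(position, vel, dt):
    """Eq. 1: explicit Euler step of the position X_i with velocity V_i."""
    return (position[0] + vel[0] * dt, position[1] + vel[1] * dt)

# Illustrative values only: 0.05 m/s, heading along +y, time step of 1/3 s.
v = velocity(0.05, math.pi / 2)
new_pos = update_position((0.10, 0.20), v, dt=1.0 / 3.0)
```
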

The model computes a circular probability distribution function (PDF) [10] corresponding to the probability of the agent moving in a specific direction (\(\varTheta _i\)). This PDF is a mixture of von Mises distributions, the circular analogue of the Gaussian distribution. The computation of this PDF involves the calculation of two other PDFs: the first one describes agent behaviour when no stimulus is present, and the second one characterises agent behaviour when conspecifics are perceived by the agent.

The PDF capturing agent behaviour when no stimulus is present is given by:

$$\begin{aligned} f_{0,z_j}(\theta )=\frac{\text {exp}(\kappa _{0,z_j} \text {cos}(\theta ))}{2\pi I_0(\kappa _{0,z_j})} \end{aligned}$$
(3)

for an agent situated in zone \(z_j\), and with \(I_0\) the modified Bessel function of the first kind of order zero. When the agent is situated in a zone close to a wall (zones 1 and 2 of Fig. 1D), we implement a wall-following behaviour by increasing the probabilities of moving towards either side of the closest wall. This is achieved by using the following PDF:

$$\begin{aligned} f_{0,z_j,w}(\theta )=\frac{1}{2} \sum _{k=1}^{2} \frac{\text {exp}(\kappa _{0,z_j} \text {cos}(\theta - \mu _{w_k} ))}{2\pi I_0(\kappa _{0,z_j})} \end{aligned}$$
(4)

with \(\mu _{w_k}\) the two possible directions along the considered wall.
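Eqs. 3 and 4 can be evaluated numerically on a grid of directions, as in the following sketch. It uses NumPy's `np.i0` for \(I_0\); the function names, the grid resolution and the value of \(\kappa _{0,z_j}\) are assumptions chosen for illustration only.

```python
import numpy as np

def von_mises(theta, mu, kappa):
    """Von Mises density on the circle, centred on mu with concentration kappa."""
    return np.exp(kappa * np.cos(theta - mu)) / (2.0 * np.pi * np.i0(kappa))

def pdf_no_stimulus(theta, kappa0):
    """Eq. 3: PDF when no stimulus is present, centred on theta = 0."""
    return von_mises(theta, 0.0, kappa0)

def pdf_wall_following(theta, kappa0, mu_wall):
    """Eq. 4: equal-weight mixture over the two directions along the closest wall."""
    return 0.5 * sum(von_mises(theta, mu, kappa0) for mu in mu_wall)

# Illustrative values: a wall running along the +/- pi/2 directions.
theta = np.linspace(-np.pi, np.pi, 721)
f0 = pdf_no_stimulus(theta, kappa0=2.0)
fw = pdf_wall_following(theta, kappa0=2.0, mu_wall=(0.5 * np.pi, -0.5 * np.pi))
```

With these values, `fw` has two symmetric modes along the wall, whereas `f0` has a single mode at \(\theta = 0\).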

Examples of agent trajectories are found in Fig. 5B. The probability that the focal fish orients towards the perceived fish is given by a mixture of von Mises distributions, each centred on the direction of a perceived fish:

$$\begin{aligned} f_{F,z_j}(\theta )=\sum _{i=1}^{n_f} \frac{A_{f_i}}{A_{T_f}} \frac{\text {exp}(\kappa _{f,z_j} \text {cos}(\theta - \mu _{f_i}))}{2\pi I_0(\kappa _{f,z_j})} \end{aligned}$$
(5)

with \(\mu _{f_i}\) the direction towards the perceived agent \(f_i\), \(A_{T_f} = \sum _{i=1}^{n_f} A_{f_i}\) the sum of the solid angles \(A_{f_i}\) captured by each agent and \(n_f\) the number of perceived agents.
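The social PDF of Eq. 5 can be sketched in the same way. The function name, the directions and the solid angles below are hypothetical values used only to illustrate the weighting by \(A_{f_i}/A_{T_f}\).

```python
import numpy as np

def pdf_social(theta, mu_f, solid_angles, kappa_f):
    """Eq. 5: mixture of von Mises components, one per perceived agent,
    weighted by the share of the total solid angle that each agent captures."""
    mu_f = np.asarray(mu_f, dtype=float)
    weights = np.asarray(solid_angles, dtype=float)
    weights = weights / weights.sum()              # A_{f_i} / A_{T_f}
    # Rows are perceived agents, columns are candidate directions.
    comps = np.exp(kappa_f * np.cos(theta[None, :] - mu_f[:, None]))
    comps /= 2.0 * np.pi * np.i0(kappa_f)
    return weights @ comps

# Illustrative values: two perceived agents, the first one appearing larger.
theta = np.linspace(-np.pi, np.pi, 721)
f_social = pdf_social(theta, mu_f=[0.3, -1.2], solid_angles=[0.02, 0.01], kappa_f=4.0)
```

Because the weights sum to one, the mixture remains a normalised PDF, and the agent that captures the largest solid angle contributes the dominant mode.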

The final PDF \(f(\theta )\) is computed as follows:

$$\begin{aligned} f_{z_j, z_k}(\theta ) = \gamma _{z_j,z_k} \frac{f_{0,z_j}(\theta ) + \alpha _{z_j} A_{T_f} f_{F,z_j}(\theta )}{1 + \alpha _{z_j} A_{T_f}} \end{aligned}$$
(6)

The parameter \(\gamma _{z_j, z_k}\), used as a multiplicative term of the final PDF, modulates the attraction of agents towards target zones. Figure 3A describes how the final PDF is computed and how it is used to determine the agents' next positions.
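The combination in Eq. 6 can be sketched as below. As an assumption of this sketch, the affinity \(\gamma \) is passed either as a scalar or as one value per candidate direction (since the destination zone depends on the direction taken); when \(\gamma \) varies with direction, the result is no longer normalised and has to be renormalised before sampling. All names and values are illustrative.

```python
import numpy as np

def final_pdf(f0, f_social, alpha, total_solid_angle, gamma):
    """Eq. 6: blend the no-stimulus and social PDFs, then apply the zone affinity.

    f0, f_social: PDFs sampled on a common grid of directions.
    gamma: affinity between origin and destination zones (scalar or per direction).
    """
    w = alpha * total_solid_angle                  # alpha_{z_j} * A_{T_f}
    return gamma * (f0 + w * f_social) / (1.0 + w)

# Illustrative values: flat no-stimulus PDF and one perceived agent at 1 rad.
theta = np.linspace(-np.pi, np.pi, 361)
f0 = np.full_like(theta, 1.0 / (2.0 * np.pi))
f_social = np.exp(3.0 * np.cos(theta - 1.0)) / (2.0 * np.pi * np.i0(3.0))
f = final_pdf(f0, f_social, alpha=50.0, total_solid_angle=0.02, gamma=1.0)
```

The weight \(\alpha _{z_j} A_{T_f}\) grows with the total solid angle captured by the neighbours, so closer or more numerous neighbours pull the final PDF towards the social term.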

Unreachable areas of the PDF (e.g. the walls) are attributed a probability of 0. Then, we numerically compute the cumulative distribution function (CDF) corresponding to this custom PDF \(f(\theta )\) by performing a cumulative trapezoidal numerical integration of the PDF in the interval \([-\pi ,\pi ]\). Finally, the model draws a random direction \(\varTheta _i\) from this distribution by inverse transform sampling. The position of the fish is then updated according to this direction and its velocity with Eqs. 1 and 2.
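The sampling step can be illustrated with the following sketch, which masks unreachable directions, integrates the PDF with a cumulative trapezoidal rule and inverts the CDF by linear interpolation. The masked region (everything behind the agent) and the PDF shape are hypothetical; the renormalisation of the CDF is an assumption of this sketch.

```python
import numpy as np

def sample_direction(theta, pdf, rng):
    """Draw one direction from a tabulated circular PDF by inverse transform sampling."""
    # Cumulative trapezoidal integration of the PDF over the grid.
    increments = 0.5 * (pdf[1:] + pdf[:-1]) * np.diff(theta)
    cdf = np.concatenate(([0.0], np.cumsum(increments)))
    if cdf[-1] <= 0.0:
        raise ValueError("PDF is zero everywhere: no reachable direction")
    cdf /= cdf[-1]                                 # renormalise after masking
    u = rng.uniform(0.0, 1.0)                      # u lies in [0, 1)
    # Last grid index whose CDF value does not exceed u.
    i = int(np.searchsorted(cdf, u, side="right")) - 1
    frac = (u - cdf[i]) / (cdf[i + 1] - cdf[i])
    return float(theta[i] + frac * (theta[i + 1] - theta[i]))

theta = np.linspace(-np.pi, np.pi, 721)
pdf = np.exp(2.0 * np.cos(theta))                  # unnormalised von Mises shape
pdf[np.abs(theta) > 0.5 * np.pi] = 0.0             # hypothetical unreachable area
rng = np.random.default_rng(0)
samples = np.array([sample_direction(theta, pdf, rng) for _ in range(1000)])
```

Searching for the bracketing interval, rather than interpolating over the whole CDF, keeps the inversion well defined on the flat parts of the CDF created by the masked directions.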

3 Results

We consider four cases. We define the Control results as obtained from biological experiments with five zebrafish in the experimental set-up described in Sect. 2.1. The Sim-MonoObj and Sim-MultiObj results are defined to correspond to the model in simulation with five agents, calibrated respectively using mono-objective or multi-objective optimisation. The Biohybrid results are obtained from experiments with four zebrafish and one robot driven by the model using the best optimised parameters.

3.1 Optimisation of Model Parameters

We define a similarity measure (ranging from 0.0 to 1.0) to compare two experiments (\(e_1\) and \(e_2\)):

$$\begin{aligned} S(e_1, e_2) = \root 3 \of {I(O_{e_1}, O_{e_2}) I(T_{e_1}, T_{e_2}) I(D_{e_1}, D_{e_2})} \end{aligned}$$
(7)

with, for an experiment \(e\), \(O_{e}\) the distribution of zone occupation, \(T_{e}\) the transition probabilities from each zone to the others, and \(D_{e}\) the distribution of inter-individual distances between all agents in each zone. The similarity measure \(S(e_1, e_2)\) corresponds to the geometric mean of the scores obtained on these three features. The function \(I(P, Q)\) is defined as:

$$\begin{aligned} I(P, Q) = 1 - H(P, Q) \end{aligned}$$
(8)

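The \(H(P, Q)\) function is the Hellinger distance between two histograms [14]. For two normalised histograms \(P = (p_1, \dots , p_d)\) and \(Q = (q_1, \dots , q_d)\) with \(d\) bins, it is defined as: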

$$\begin{aligned} H(P, Q) = \frac{1}{\sqrt{2}} \sqrt{ \sum _{i=1}^{d} (\sqrt{p_i} - \sqrt{q_i} )^2 } \end{aligned}$$
(9)
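Eqs. 7 to 9 can be sketched as follows. For simplicity, this sketch assumes that each of the three features of an experiment is summarised by a single normalised histogram; the histogram values are made up for illustration.

```python
import numpy as np

def hellinger(p, q):
    """Eq. 9: Hellinger distance between two normalised histograms."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)) / np.sqrt(2.0))

def similarity(features_1, features_2):
    """Eqs. 7 and 8: geometric mean of 1 - H over the paired feature histograms."""
    scores = [1.0 - hellinger(p, q) for p, q in zip(features_1, features_2)]
    return float(np.prod(scores) ** (1.0 / len(scores)))

# Hypothetical histograms for occupation, transitions and distances.
control = ([0.2, 0.3, 0.5], [0.1, 0.9], [0.25, 0.25, 0.5])
simulated = ([0.3, 0.3, 0.4], [0.2, 0.8], [0.30, 0.30, 0.4])
score_same = similarity(control, control)
score_diff = similarity(control, simulated)
```

Because the score is a geometric mean, a poor match on any single feature lowers the overall similarity, even if the two other features match well.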

We consider two optimisation methods. In the Sim-MonoObj case, we use the CMA-ES [1] mono-objective optimisation algorithm, with the task of maximising the \(S(e_1, e_2)\) function. In the Sim-MultiObj case, we use the NSGA-II [13] multi-objective algorithm with three objectives to maximise. The first objective is a performance objective corresponding to the \(S(e_1, e_2)\) function. We also consider two other objectives used to guide the evolutionary process: one that promotes genotypic diversity [20] (defined by the mean Euclidean distance of the genome of an individual to the genomes of the other individuals of the current population), the other encouraging behavioural diversity (defined by the Euclidean distance between the \(O_{e}\), \(T_{e}\) and \(D_{e}\) scores of an individual). In both methods, we use populations of 60 individuals (approximately twice the number of dimensions of the problem) and 300 generations. The Sim-MonoObj case stabilises around the 50th generation, and the Sim-MultiObj case around the 250th generation. The linear speed \(v_i\) of the agents is not optimised; it is drawn randomly from the instantaneous speed distribution measured in the Control experiments. It should be noted that the evolutionary algorithms do not over-fit (as this is an optimisation process), even though we use the same data (trajectories) for both training and testing.
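The genotypic diversity objective described above (mean Euclidean distance of a genome to the other genomes of the population) could be computed as in the following sketch. The array layout and the function name are assumptions, and the three-individual population is a toy example.

```python
import numpy as np

def genotypic_diversity(population):
    """Mean Euclidean distance from each genome to all other genomes.

    population: array of shape (n_individuals, n_parameters).
    Returns one diversity score per individual.
    """
    pop = np.asarray(population, dtype=float)
    diffs = pop[:, None, :] - pop[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)         # pairwise distance matrix
    n = pop.shape[0]
    return dists.sum(axis=1) / (n - 1)             # the self-distance is zero

# Toy population: two identical genomes and one distant genome.
diversity = genotypic_diversity([[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]])
```

In this toy population, the isolated genome receives the highest score, which is the behaviour expected from an objective that rewards exploration of the parameter space.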

3.2 Robot Implementation

The robot is driven by the model described in Sect. 2.2, after calibration. Robotic trials have a duration of 15 min, and are repeated 10 times. They involve one robot and four zebrafish. Every 333 ms, we integrate the tracked positions of the four fish into the model, and compute the target position of a fifth agent. We then control the robot to follow this target position by using the biomimetic movement patterns described in [4, 9].

Fig. 4.

Ethogram as a finite state machine corresponding to the behavioural attractors for all agents. Each zone drives the agents into the corresponding behavioural attractor. Thus, agents modulate their behaviour in each zone as if they entered a specific behavioural state. Here we show the resulting transition probabilities obtained after optimisation and implementation as robotic controllers (Biohybrid), based on the experimental observations (Control). The number in each state corresponds to the proportion of time agents spend in this state. The numbers on the arrows correspond to the transition probabilities between zones with a time step of 1/3 s.

3.3 Model Performance Analysis and Experimental Validation

We assess the similarity between the results from the calibrated cases (Sim-MonoObj, Sim-MultiObj and Biohybrid) and those of the Control case by using the similarity measure defined in Sect. 3.1. The similarity scores are shown in Table 1.

Using information about zone occupation and probabilities of transition from one zone to another, we define a finite state machine corresponding to the behavioural attractor dynamics of the entire agent population. The resulting finite state machines obtained from the Control and Biohybrid cases are shown in Fig. 4. The probability of presence of an agent in each part of the arena is presented in Fig. 5A. Examples of agent trajectories are found in Fig. 5B.
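The quantities that define such a finite state machine can be estimated from sequences of zone labels, as in the following sketch. It assumes that each agent position has already been assigned to zone 1, 2 or 3 at a fixed time step, and it processes a single, made-up sequence rather than the pooled data of the whole population.

```python
from collections import Counter

def occupation_and_transitions(zone_sequence, zones=(1, 2, 3)):
    """Estimate zone occupation and zone-to-zone transition probabilities
    from a sequence of zone labels sampled at a fixed time step."""
    counts = Counter(zone_sequence)
    total = len(zone_sequence)
    occupation = {z: counts[z] / total for z in zones}
    pair_counts = Counter(zip(zone_sequence[:-1], zone_sequence[1:]))
    transitions = {}
    for a in zones:
        # Number of steps that start in zone a (including steps that stay in a).
        out = sum(pair_counts[(a, b)] for b in zones)
        transitions[a] = {b: (pair_counts[(a, b)] / out if out else 0.0) for b in zones}
    return occupation, transitions

# Hypothetical sequence of zone labels for one agent.
occ, trans = occupation_and_transitions([3, 3, 1, 2, 3, 3, 3, 1])
```

The occupation values play the role of the numbers shown in the states of the ethogram, and each row of the transition table plays the role of the numbers on the outgoing arrows.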

The best-performing individuals of the Sim-MonoObj and Sim-MultiObj cases display distributions of inter-individual distances that are relatively close to those of the Control case, which suggests that these models can convincingly reproduce the tendency of fish to aggregate. However, of the two cases performed in simulation, only Sim-MultiObj is capable of displaying zone dynamics (occupation of the zones, and transition probabilities from one zone to the others) similar to the Control case. This suggests that multi-objective optimisation is required to handle the conflicting dynamics present in fish collective behaviour.

The robot of the Biohybrid case is driven by a controller using our model with the parameters of the best-performing individual obtained in the Sim-MultiObj case. The results of the Biohybrid case correspond to those of the Sim-MultiObj case. The ethogram of the Biohybrid case (cf. Fig. 4) shows an increased preference for the centre of the rooms compared to the Control case. This could be explained by our current lower-level robotic implementation of wall-following behaviour, which could still be sub-optimal.

Fig. 5.

Panel A: Probabilities of presence in each part of the arena, for all cases. Panel B: Examples of trajectories over a duration of 2 min (1800 frames). In the Biohybrid case, the robot is in black.

Table 1. Similarity scores between the best-performing individuals of the three calibrated cases and the Control case used as reference, as defined in Sect. 3.1. We consider three standard features to characterise the collective behaviour exhibited in each case. Occupation corresponds to the probability of presence of the agent in each zone. Transitions corresponds to the probabilities that an agent transitions from one zone to another. Inter-individual distances corresponds to the distribution of inter-individual distances between all agents in a specific zone. The fitness function is computed as the geometric mean of these scores.

4 Discussion and Conclusion

Collective behaviour models often focus on collective motion in homogeneous unbounded environments. Here we present a multi-level model that is space-dependent, with individuals that behave in a context-dependent way. We make the hypothesis that the type of behaviour displayed by the agents depends on their position in the environment. This allows us to segment our environment into several characteristic zones, each corresponding to a particular behavioural attractor, matching different types of agent behaviour.

We present a methodology to calibrate this model to correspond to the collective dynamics exhibited by fish in the experiments. This calibration process is challenging, as it involves a trade-off between social tendencies (group formation), and response to the environment (wall-following, exploration). Moreover, our model encompasses the notion of behavioural attractors, allowing agents to exhibit several different behaviours depending on the context. Our methodology is able to cope with this trade-off by using multi-objective optimisation.

However, this calibration methodology could still be improved: the similarity measure we use to compare two cases only takes into account three aspects of collective behaviour, corresponding to behavioural attractors and aggregation dynamics. Other behavioural aspects could also be relevant and could be considered: at the level of collective dynamics (e.g. features of agent groups, residence time in a zone), at the level of the individuals (e.g. features of agent trajectories, such as their curvature), or in relation to the environment (e.g. the distance of an agent to the nearest wall). Alternatively, it would be possible to perform the calibration without defining a similarity measure explicitly, using a method similar to [17], by co-evolving simultaneously the parameters of the models and classifiers. These classifiers would be trained to identify whether or not the resulting behaviours of the optimised models are distinct from the behaviours from the reference experiments.

Here, we make the assumption that the behavioural attractors are linked to the position of the agents in their environment. This assumption could be relaxed to handle ethograms with more complex classes of behaviours, such as behavioural attractors linked to agent group dynamics. Additionally, the idea that actions are selected and segmented by the fish is questionable. While our decomposition of fish behaviour into different behavioural attractors is convenient for modelling purposes and eases the implementation of a biomimetic robot controller by providing a collection of discrete acts that it can perform, it has not been established that fish make this kind of decomposition into distinct elements (actions) [6]. Finally, we could apply our model in more complex set-ups, involving large societies with a larger number of robots, and with a more complex topology.