1 Introduction

Autonomous, biomimetic robots can serve as tools in animal behavioural studies. Robots are used in ethology and behavioural studies to untangle the multimodal interactions and communication between animals [23]. When they are socially integrated in a group of animals, they can send calibrated stimuli to test the animals' responses in a social context [17]. Moreover, interactions between animals and autonomous robots represent an interesting challenge for robotics. Confronting robots with animals is a difficult task because specific behavioural models have to be designed and the robots have to be socially accepted by the animals. The robots have to engage in social behaviour and somehow convince the animals that they can be social companions. In this context, the capabilities of the robots and their intelligence are tested under harsh conditions, which often reveal the large gap that still exists between autonomous robots and animals, not only in motion and in coping with the environment but also in intelligence. It is a direct comparison of artificial and natural collective intelligence. The design of such social robots is also challenging because it involves both a luring capability, including appropriate robot behaviours, and the social acceptance of the robots by the animals. We have shown that the social integration of robots into groups of fish can be improved by refining the behavioural models used to build their controllers [8]. The models also have to be calibrated to accurately replicate the animals' collective behaviours in complex environments [8].

Research on animal and robot interactions also needs biomimetic formal models as behavioural controllers if the robots have to behave as congeners [2, 3]. Robot controllers have to deal with a whole range of behaviours so that the robots take into account not only the other individuals but also the environment, in particular the walls [7, 8]. However, most biological collective behaviour models deal with only one sub-part of fish behaviour at a time, in unbounded environments. Controllers based on neural networks, such as multilayer perceptrons (MLP) [22] or echo state networks (ESN) [20], have the advantage of being easier to implement and could deal with a larger range of events.

Objectives

We aim to build models that accurately generate the zebrafish trajectories of one individual within a small group of 5 agents. The trajectories are the result of social interactions in a bounded environment. Zebrafish are a classic animal model in genetics and in the neurosciences of individual and collective behaviours. Building models that correctly reproduce the individual trajectories of fish within a group is still an open question [18]. We explore MLP and ESN models, optimised by evolutionary computation, to generate individual trajectories. MLP and ESN are black-box models that need little a priori information from the modeller. They are optimised on the experimental data and as such represent a model of the complex experimental collective trajectories. However, they are difficult to calibrate on the zebrafish experimental data because of the complexity of the fish trajectories. Here, we consider the design and calibration by evolutionary computation of neural network models, MLP and ESN, that can become robot controllers. We test two evolutionary optimisation methods, CMA-ES [1] and NSGA-III [33], and show that the latter gives better results. We show that such MLP and ESN behavioural models could be useful in animal and robot interactions and could make the robots accepted by the animals by reproducing their behaviours and trajectories, as in [8].

2 Materials and Methods

2.1 Experimental Set-Up

We use the same experimental procedure, fish handling, and set-up as in [2, 4, 6, 8, 10, 28]. The experimental arena is a square white plexiglass aquarium of \(1000\times 1000\times 100\) mm. An overhead camera captures frames at 15 FPS, with a \(500\times 500\) px resolution; the frames are then tracked to find the fish positions. We use 10 groups of 5 adult wild-type AB zebrafish (Danio rerio) in 10 trials, each lasting 30 min, as in [2, 4, 6, 8, 10, 28]. The experiments performed in this study were conducted under the authorisation of the Buffon Ethical Committee (registered to the French National Ethical Committee for Animal Experiments #40) after submission to the French state ethical board for animal experiments.

Fig. 1.

Methodology workflow. An evolutionary algorithm is used to evolve the weights of an MLP (1 hidden layer, 100 neurons) or an ESN (100 reservoir neurons) neural network that serves as the controller of a simulated robot interacting with 4 fish described by the experimental data. Only the connections represented by dotted arrows are evolved (for MLP: all connections; for ESN: connections from inputs to reservoir, from reservoir to outputs, and from outputs to outputs and to reservoir). The fitness function is computed through data analysis of these simulations and represents the biomimetism metric of the simulated robot behaviour compared to the behaviour exhibited by real fish in experiments. Two evolutionary algorithms are tested: CMA-ES (mono-objective) and NSGA-III (multi-objective).

2.2 Artificial Neural Network Model

Black-box models, like artificial neural networks (ANN), can be used to model phenomena with little a priori information. Although they have not yet been used to model fish collective behaviours based on experimental data, here we show that they are relevant to model zebrafish collective behaviour. We propose a methodology (Fig. 1) where either a multilayer perceptron (MLP) [22] artificial neural network or an echo state network (ESN) [20] is calibrated through the use of evolutionary algorithms to model the behaviour of a simulated fish in a group of 5 individuals. The 4 other individuals are described by the experimental data obtained with 10 different groups of 5 fish for trials lasting 30 min.

MLP are a type of feedforward artificial neural networks that are very popular in artificial intelligence to solve a large variety of real-world problems [25]. Their capability to universally approximate functions [11] makes them suitable to model control and robotic problems [25]. We consider MLP with only one hidden layer of 100 neurons (using a hyperbolic tangent function as activation function).
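As an illustration of the MLP described above, the following minimal sketch decodes a flat list of evolved weights into a network with one hidden layer of tanh neurons and computes its outputs. The function names, the genome layout, the presence of one bias per neuron and the linear output layer are assumptions made for this example; they are not specified in the text.

```python
import math


def genome_size(n_in, n_hidden=100, n_out=2):
    """Number of evolved weights, assuming one bias per neuron."""
    return n_hidden * (n_in + 1) + n_out * (n_hidden + 1)


def decode_genome(genome, n_in, n_hidden=100, n_out=2):
    """Split a flat genome into (w1, b1, w2, b2) nested lists."""
    if len(genome) != genome_size(n_in, n_hidden, n_out):
        raise ValueError("genome length does not match the network shape")
    values = iter(genome)
    w1 = [[next(values) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [next(values) for _ in range(n_hidden)]
    w2 = [[next(values) for _ in range(n_hidden)] for _ in range(n_out)]
    b2 = [next(values) for _ in range(n_out)]
    return w1, b1, w2, b2


def mlp_forward(params, inputs):
    """Feedforward pass: tanh hidden layer, then a linear output layer."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(w1, b1)]
    # Linear output layer (an assumption; the text does not specify it).
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]
```

With this layout the optimiser only needs to manipulate a flat vector of real numbers, which is the representation that evolutionary algorithms such as CMA-ES or NSGA-III typically work on.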

ESN are recurrent neural networks often used to model temporal processes, like time series, or robot control tasks [26]. They are sufficiently expressive to model complex non-linear temporal problems that non-recurrent MLP cannot model.
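To make the recurrence concrete, here is a minimal sketch of one possible ESN update step, using the connection groups listed in the caption of Fig. 1 (inputs to reservoir, reservoir to outputs, outputs to outputs, and outputs to reservoir). The class name, the tanh reservoir activation, the absence of a leak rate, the linear readout and the random placeholder weights are all assumptions of this example, not details given in the text.

```python
import math
import random


def _matvec(matrix, vector):
    """Multiply a nested-list matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class EchoStateNetwork:
    def __init__(self, n_in, n_res=100, n_out=2, scale=0.1, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.uniform(-scale, scale) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_res = rand(n_res, n_res)  # reservoir to reservoir (kept fixed)
        # The four groups below stand in for evolved weights.
        self.w_in = rand(n_res, n_in)    # inputs to reservoir
        self.w_fb = rand(n_res, n_out)   # outputs to reservoir
        self.w_out = rand(n_out, n_res)  # reservoir to outputs
        self.w_oo = rand(n_out, n_out)   # outputs to outputs
        self.state = [0.0] * n_res
        self.output = [0.0] * n_out

    def step(self, inputs):
        """Advance the network by one time-step and return its outputs."""
        drive = zip(_matvec(self.w_in, inputs),
                    _matvec(self.w_res, self.state),
                    _matvec(self.w_fb, self.output))
        self.state = [math.tanh(a + b + c) for a, b, c in drive]
        readout = _matvec(self.w_out, self.state)
        feedback = _matvec(self.w_oo, self.output)
        self.output = [r + f for r, f in zip(readout, feedback)]
        return self.output
```

Because the reservoir state persists between calls to `step`, the output at a given time-step depends on the history of inputs, which is what distinguishes this model from the feedforward MLP.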

For the considered focal agent, the neural network model takes the following parameters as input: (i) the direction vector (angle and distance) from the focal agent towards each other agent; (ii) the angular distance between the focal agent direction and each other agent direction (alignment measure); (iii) the direction vector (angle and distance) from the focal agent towards the nearest wall; (iv) the instant linear speed of the focal agent at the current time-step, and at the previous time-step; (v) the instant angular speed of the focal agent at the current time-step, and at the previous time-step. This set of inputs is typically used in multi-agent modelling of animal collective behaviour [13, 30]. As a first step, we consider that it is sufficient to model fish behaviour with neural networks.
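The input set (i) to (v) can be sketched as a feature vector as follows. This is only an illustration: the function names, the tuple layout of the agents, the choice of expressing angles relative to the focal agent's heading, and the axis-aligned square arena of side 1 m (taken from Sect. 2.1) are assumptions of this example.

```python
import math


def wrap_angle(angle):
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def build_inputs(focal, others, speeds, arena=1.0):
    """Build the input vector of the focal agent.

    focal:  (x, y, heading) of the focal agent, in metres and radians
    others: list of (x, y, heading) for the other agents
    speeds: (linear_now, linear_prev, angular_now, angular_prev)
    """
    x, y, heading = focal
    features = []
    for ox, oy, other_heading in others:
        dx, dy = ox - x, oy - y
        features.append(wrap_angle(math.atan2(dy, dx) - heading))  # (i) angle
        features.append(math.hypot(dx, dy))                        # (i) distance
        features.append(wrap_angle(other_heading - heading))       # (ii) alignment
    # (iii) nearest wall: (distance, outward direction of that wall)
    walls = [(x, math.pi), (arena - x, 0.0),
             (y, -math.pi / 2), (arena - y, math.pi / 2)]
    distance, direction = min(walls)
    features.append(wrap_angle(direction - heading))
    features.append(distance)
    features.extend(speeds)  # (iv) and (v)
    return features
```

With 4 other agents, this layout yields 4 × 3 + 2 + 4 = 18 input values.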

The neural network has two outputs corresponding to the change in linear and angular speeds to apply from the current time-step to the next time-step. Here, we limit our approach to modelling fish trajectories resulting from social interactions in a homogeneous environment but bounded by walls. Very few models of fish collective behaviours take into account the presence of walls [5, 9].
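One way to turn the two outputs into a trajectory is a simple kinematic update, sketched below. The explicit Euler integration, the state layout and the clamping of positions to the arena are assumptions made for this example; only the time-step of 1/15 s (15 time-steps per second) and the 1 m arena come from the text.

```python
import math

DT = 1.0 / 15.0  # seconds per time-step (15 time-steps per second)


def apply_outputs(state, outputs, dt=DT, arena=1.0):
    """Advance the focal agent by one time-step.

    state:   (x, y, heading, linear_speed, angular_speed)
    outputs: (change in linear speed, change in angular speed)
    """
    x, y, heading, linear, angular = state
    d_linear, d_angular = outputs
    linear += d_linear
    angular += d_angular
    turned = heading + angular * dt
    heading = math.atan2(math.sin(turned), math.cos(turned))
    # Clamp to the arena so the agent cannot cross the walls (assumption).
    x = min(max(x + linear * math.cos(heading) * dt, 0.0), arena)
    y = min(max(y + linear * math.sin(heading) * dt, 0.0), arena)
    return (x, y, heading, linear, angular)
```

For instance, an agent at the centre of the arena moving at 0.15 m/s along the x axis advances by 0.01 m per time-step when both outputs are zero.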

2.3 Data Analysis

For each experimental trial or simulation, e, we compute several behavioural metrics using the tracked positions of agents: (i) the distribution of inter-individual distances between agents (\(D_e\)); (ii) the distribution of instant linear speeds (\(L_e\)); (iii) the distribution of instant angular speeds (\(A_e\)); (iv) the distribution of polarisation of the agents in the group (\(P_e\)); and (v) the distribution of distances of agents to their nearest wall (\(W_e\)). The polarisation of an agent group measures how aligned the agents in a group are, and is defined as the absolute value of the mean agent heading: \(P = \frac{1}{N} \bigl |\sum ^{N}_{i=1} u_i \bigr |\) where \(u_i\) is the unit direction of agent i and \(N=5\) is the number of agents [32].
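The polarisation formula above can be sketched directly, assuming that each agent's unit direction \(u_i\) is given by a heading angle in radians (the function name and this angle representation are choices made for the example).

```python
import math


def polarisation(headings):
    """Norm of the mean unit direction vector of the agents."""
    n = len(headings)
    if n == 0:
        raise ValueError("at least one agent is required")
    sum_x = sum(math.cos(h) for h in headings)
    sum_y = sum(math.sin(h) for h in headings)
    return math.hypot(sum_x, sum_y) / n
```

A group in which all agents share the same heading has a polarisation close to 1, whereas two agents heading in opposite directions give a value close to 0.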

We define a similarity measure (ranging from 0.0 to 1.0) to measure the biomimetism of the simulated robot behaviour by comparing the behaviour of the group of agents in simulations where the robot is present (experiment \(e_r\): four fish and one robot) to the behaviour of the experimental fish groups (experiment \(e_c\): five fish):

$$\begin{aligned} S(e_r, e_c) = \sqrt[5]{I(D_{e_r}, D_{e_c})\, I(L_{e_r}, L_{e_c})\, I(A_{e_r}, A_{e_c})\, I(P_{e_r}, P_{e_c})\, I(W_{e_r}, W_{e_c})} \end{aligned}$$
(1)

The function \(I(X, Y)\) is defined as \(I(X, Y) = 1 - H(X, Y)\), where \(H(X, Y)\) is the Hellinger distance between two histograms [14]. It is defined as: \(H(X, Y) = \frac{1}{\sqrt{2}} \sqrt{ \sum _{i=1}^{d} (\sqrt{X_i} - \sqrt{Y_i} )^2 }\) where \(X_i\) and \(Y_i\) are the bin frequencies.
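The similarity score of Eq. (1) can be sketched as follows, assuming that each simulated histogram is compared with the control histogram of the same feature and that histograms are passed as lists of bin counts over identical bins. The function names, the dictionary keys and the clamping of \(I\) at 0 to absorb rounding errors are assumptions of this example.

```python
import math

FEATURES = ("D", "L", "A", "P", "W")


def normalise(counts):
    """Convert bin counts to bin frequencies that sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram must contain at least one sample")
    return [c / total for c in counts]


def hellinger(x, y):
    """Hellinger distance between two histograms of bin frequencies."""
    if len(x) != len(y):
        raise ValueError("histograms must have the same number of bins")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(x, y))
    return math.sqrt(squared) / math.sqrt(2.0)


def similarity(robot_hists, control_hists):
    """Geometric mean of I = 1 - H over the five behavioural features."""
    product = 1.0
    for name in FEATURES:
        h = hellinger(normalise(robot_hists[name]),
                      normalise(control_hists[name]))
        product *= max(0.0, 1.0 - h)  # clamp to guard against rounding
    return product ** (1.0 / len(FEATURES))
```

Because the score is a geometric mean, a single feature with no overlap between the two distributions drives the whole score to 0, so a controller cannot compensate for a badly reproduced feature with the others.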

This score measures the social acceptance of the robot by the fish, as defined in [7, 8]. Compared to the similarity measure defined in these articles, we added a measure of the polarisation of the agents. This was motivated by the tendency of our evolved neural models, without a polarisation factor, to generate agents with unnatural looping behaviour to catch up with the group.

2.4 Optimisation

We calibrate the ANN models presented here to match as closely as possible the behaviour of one fish in a group of 5 individuals in 30-min simulations (at 15 time-steps per second, i.e. 27000 steps per simulation). This is achieved by optimising the connection weights of the ANN through evolutionary computation, which iteratively performs global optimisation (inspired by biological evolution) on a defined fitness function so as to find its maxima [21, 27].

We consider two optimisation methods (as in [7]) for MLP and ESN networks. In the Sim-MonoObj-MLP case, we use the CMA-ES [1] mono-objective evolutionary algorithm to optimise an MLP, with the task of maximising the \(S(e_r, e_c)\) function. In the Sim-MultiObj-MLP and Sim-MultiObj-ESN cases, we use the NSGA-III [33] multi-objective algorithm with three objectives to maximise. The first objective is a performance objective corresponding to the \(S(e_r, e_c)\) function. We also consider two other objectives used to guide the evolutionary process: one that promotes genotypic diversity [24] (defined by the mean Euclidean distance of the genome of an individual to the genomes of the other individuals of the current population), the other encouraging behavioural diversity (defined by the Euclidean distance between the \(D_{e}\), \(L_{e}\), \(A_{e}\), \(P_{e}\) and \(W_{e}\) scores of an individual). The NSGA-III algorithm was used with a crossover probability of 0.80 and a mutation probability of 0.20 (we also tested this algorithm with only mutations and obtained similar results). The NSGA-III algorithm [33] is considered instead of the NSGA-II algorithm [12] employed in [7] because it is known to converge faster than NSGA-II on problems with more than two objectives [19].
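The genotypic diversity objective, as defined above, can be sketched as the mean Euclidean distance between one genome and the other genomes of the current population. The function name and the representation of the population as a list of equal-length weight lists are assumptions of this example.

```python
import math


def genotypic_diversity(index, population):
    """Mean Euclidean distance of one genome to all the other genomes."""
    genome = population[index]
    others = [g for i, g in enumerate(population) if i != index]
    if not others:
        raise ValueError("the population needs at least two individuals")
    return sum(math.dist(genome, g) for g in others) / len(others)
```

For a population `[[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]]`, the first individual is at distance 5 from the second and 0 from the third, so its diversity value is 2.5. An individual located far from the rest of the population in weight space receives a higher value, which rewards exploration when this quantity is maximised alongside the performance objective.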

In both methods, we use populations of 60 individuals and 300 generations. Each case is repeated in 10 different trials. We use an NSGA-III implementation based on the DEAP Python library [16].

Fig. 2.

Similarity scores between the behaviour of the experimental fish groups (control) and the behaviour of the best-performing simulated individuals of the MLP models optimised by CMA-ES or NSGA-III. Results are obtained over 10 different trials (experiments for fish-only groups, and simulations for NN models). We consider five behavioural features to characterise exhibited behaviours. Inter-individual distances corresponds to the similarity in distribution of inter-individual distances between all agents and measures the capability of the agents to aggregate. Linear and Angular speeds distributions correspond to the distributions of linear and angular speeds of the agents. Polarisation measures how aligned the agents are in the group. Distances to nearest wall corresponds to the similarity in distribution of agent distances to their nearest wall, and assesses their capability to follow the walls. The Biomimetic score corresponds to the geometric mean of the other scores.

Fig. 3.

Comparison between 30-min trials involving 5 fish (control, biological data) and simulations involving 4 fish and 1 robot, over 10 trials and across 5 behavioural features: inter-individual distances (A), linear (B) and angular (C) speeds distributions, polarisation (D), and distances to nearest wall (E).

3 Results

We analyse the behaviour of one simulated robot in a group of 4 fish. The robot is driven by an ANN (either MLP or ESN) evolved with CMA-ES (Sim-MonoObj-MLP case) or with NSGA-III (Sim-MultiObj-MLP and Sim-MultiObj-ESN cases), and we compare its behaviour to that of fish-only groups (Control case). We only consider the best-evolved ANN controllers. In the simulations, the simulated robot does not influence the fish because the fish are described by their experimental data, which are replayed.

Examples of agent trajectories obtained in the tested cases are found in Fig. 4A. In the Sim-MonoObj-MLP and Sim-MultiObj-* cases, they correspond to the trajectory of the simulated robot agent. In both cases, we can see that the robot follows the walls like the fish, and is often part of the fish group as natural fish are. However, the robot trajectories can incorporate patterns not found in the fish trajectories. For example, small circular loops appear when the robot performs a U-turn to catch up with the fish group. This is particularly frequent in the Sim-MonoObj-MLP case, and seldom appears in the Sim-MultiObj-* cases.

Fig. 4.

Agent trajectories observed during 30-min trials in a square (1 m) aquarium, for the 4 considered cases. Control: reference experimental fish data obtained as in [9, 28]; Sim-MonoObj-MLP: MLP optimised by CMA-ES; Sim-MultiObj-MLP: MLP optimised by NSGA-III; Sim-MultiObj-ESN: ESN optimised by NSGA-III. A: Examples of an individual trajectory of one agent among the 5 making up the group (fish or simulated robot) during 1 min of a 30-min trial. B: Presence probability density of agents in the arena.

We compute the presence probability density of agents in the arena (Fig. 4B): it shows that the robot tends to follow the walls as the fish do naturally.

For the three tested cases, we compute the statistics presented in Sect. 2.3 (Fig. 3). The corresponding similarity scores are shown in Fig. 2. The results of the Control case show sustained aggregative and wall-following behaviours of the fish group. Fish also seldom pass through the centre of the arena, possibly in small short-lived sub-groups. There is behavioural variability between groups, especially in aggregative tendencies (measured by inter-individual distances) and wall-following behaviour (measured by the distance to the nearest wall), because each of the 10 groups is composed of different fish, i.e. 50 fish in total.

The similarity scores of the Sim-MultiObj-* cases are often within the variance domain of the Control case, except for the inter-individual score. This suggests that groups incorporating the robot driven by an MLP evolved by NSGA-III exhibit dynamics relatively similar to those of a fish-only group, at least according to our proposed measures. However, the result can still be improved: the robot is sometimes at the tail of the group, possibly because of gaps created between the robot and the fish group by small trajectory errors (e.g. the small loops shown in robot trajectories in Fig. 4A).

The Sim-MonoObj-MLP case sacrifices biomimetism to focus mainly on group-following behaviour: this translates into a higher inter-individual score than in the Sim-MultiObj-* cases, and the robot tends to follow the fish group closely. With Sim-MonoObj-MLP, the robot goes faster than the fish, and quickly turns back towards the centroid of the group if it is too far ahead of the group: this explains the large presence of loops in Fig. 4A. The Sim-MonoObj-MLP case does not take into account behavioural diversity as the Sim-MultiObj-* cases do, but focuses on the behaviour that is easier to find (namely the group-following behaviour) and stays stuck in this local optimum.

There are few differences between the results of the Sim-MultiObj-MLP and the Sim-MultiObj-ESN cases, the latter often showing slightly lower scores than the former. However, the Sim-MultiObj-ESN case displays a large variability of inter-individual scores, which could suggest that its expressivity would be sufficient to model agents with more biomimetic behaviours if the correct connection weights were found by the optimiser.

4 Discussion and Conclusion

We evolved artificial neural networks (ANN) to model the behaviour of a single fish in a group of 5 individuals. This ANN controller was used to drive the behaviour of a robot agent in simulations so that it integrates into the group of fish by exhibiting biomimetic behavioural capabilities. Our methodology is similar to the calibration methodology developed in [7], but employs artificial neural networks instead of an expert-designed behavioural model. Artificial neural networks are black-box models that require little a priori information about the target tasks.

We design a biomimetism score from behavioural measures to assess the biomimetism of robot behaviour. In particular, we measure the aggregative tendencies of the agents (inter-individual distances), their disposition to follow walls, to be aligned with the rest of the group (polarisation), and their distribution of linear and angular speeds.

However, finding ANN that display behaviours with appropriate levels of biomimetism is challenging, as fish behaviour is inherently multi-level (tail-beats as motor response vs individual trajectories vs collective dynamics), multi-modal (several kinds of behavioural patterns, and input/output sources), context-dependent (different behaviours depending on the spatial position and proximity to other agents) and stochastic (leading to individual and collective choices and action selection) [9, 29]. More specifically, fish dynamics involve trade-offs between social tendencies (aggregation, group formation) and response to the environment (wall-following, zone occupation); they also follow distinct movement patterns that allow them to move in a polarised group and react collectively to environmental and social cues.

We show that these artificial neural models can be optimised with evolutionary algorithms, using the biomimetism score of robot behaviour as a fitness function. The best-performing evolved ANN controllers show competitive biomimetism scores compared to the behavioural variability of fish groups. We demonstrate that taking into account genotypic and behavioural diversity in the optimisation process (through the use of the global multi-objective optimiser NSGA-III) improves the biomimetic scores of the best-performing evolved controllers. The ANN models evolved through mono-objective optimisation tend to focus more on evolving a group-following behaviour than on a biomimetic agent.

Our approach can still be improved. In particular, we only evolve the behaviour of a single agent in a group, rather than all agents of the group. This choice was motivated by the large increase in difficulty in evolving ANN models for the entire group, which would also involve additional behavioural trade-offs: e.g. individual free will and autonomous dynamics, individuals leaving or re-joining the group. However, it also means that here the fish do not react to the robot in simulations because the fish behaviour is a replay of experimental fish trajectories recorded without a robot.

Additionally, it may be possible to improve the performance (in terms of biomimetism) of the multi-objective optimisation process by combining additional selection pressures as objectives (i.e. not just genotypic and behavioural diversity) [15]. We already include behavioural and phenotypic diversities as selection pressures to guide the optimisation process; however, taking into account phenotypic diversity can bias the optimisation algorithm to explore rather than exploit, which can prevent some desired phenotypes from being considered by the optimisation algorithm. An alternative would be to use angular diversity instead [31].

This study shows that ANN are good candidates to model individual and collective fish behaviours, in particular in the context of social bio-hybrid systems composed of animals and robots. They can be calibrated on experimental data by evolutionary computation. This approach requires less a priori knowledge than equation-based or agent-based modelling techniques. Although they are black-box models, they could also produce interesting results from a biological point of view. Thus, ANN collective behaviour models can be an interesting approach to design animal and robot social interactions.