1 Introduction

Microscopic pedestrian models are frequently used in traffic engineering to predict crowd dynamics. Classical operational approaches are generally decision-based, velocity-based or force-based models (see [24] and the references therein). Such models account for physical as well as social or psychological factors. They consist of basic rules or generic functions that depend locally on the environment. A few parameters, which generally have physical interpretations, allow the model to be adjusted.

Before applying simulations to make predictions, the model parameters have to be calibrated and the models have to be validated, experimentally or statistically by using real data. The validation can be carried out by checking whether the models are able to describe the dynamics accurately for configurations different from the ones used for the calibration (cross-validation) [28]. A good model should provide realistic dynamics in different conditions (i.e. different geometries, initial or boundary conditions) for the same setting of the parameters.

The parameters of the convection part of the models (for instance, desired speed or time gap) can be related to the fundamental diagram (FD), a phenomenological relation between the speed and the distance spacing to the surrounding neighbours and obstacles [25]. This relation can be used explicitly to model the speed of the pedestrian and is then related to the optimal velocity, a concept borrowed from traffic modelling [2]; see, e.g., [15, 18, 19]. It is also used as an implicit relation (see, e.g., [3, 10, 11]) that is determined by considering uni-dimensional flows [4]. By neglecting anisotropic effects (such as vision-based effects), microscopic models can be characterised at an aggregated level by fundamental diagrams that relate the speed to a local density given by the mean distance spacing to the closest neighbours [6]. In the following, we refer to a model based simply on a fundamental diagram as an FD-based model.

Despite their simplicity, microscopic models can describe realistic pedestrian flows, as well as self-organisation phenomena such as lane formation or alternation of flow at a bottleneck in bi-directional streams [12, 24]. However, the prediction of pedestrian movement in complex spatial structures (e.g. buildings such as stadia or stations) remains problematic. Observations show that pedestrians tend to adapt their behaviour to the facility [5]. For instance, the flow tends to increase locally at bottlenecks [20, 26, 30]. This leads to geometry-dependent characteristics, so that aggregated models based on a single fundamental diagram cannot accurately describe transitions between different types of facilities (such as corridor, T-junction, crossing or bottleneck), or from platforms to stairs.

An alternative data-driven approach for the prediction of pedestrian dynamics in complex geometries could be provided by artificial neural networks (ANN). Neural networks have already proven their efficiency for motion planning in robotics and autonomous vehicles [13, 23], and are starting to be used for pedestrian dynamics as well [1, 6, 8, 16]. Such an approach is data-based and, in contrast to classical models, deliberately has a very large number of parameters with no explicit physical meaning (see Fig. 1). ANN can reproduce complex patterns that FD-based models, which describe the dynamics at an aggregated level, cannot.

Fig. 1 Minimalistic illustrative scheme for the distinction between FD-based models, which are explicit non-linear functions calibrated by few meaningful parameters, and neural networks, for which the non-linear function is data based and has deliberately a large number of parameters

The aim of this work is to evaluate whether ANN can accurately describe pedestrian behaviour for different types of geometries. A feed-forward neural network is compared to an FD-based model using data from experiments in a bottleneck and in a corridor with closed boundary conditions (in the following ‘bottleneck’ and ‘ring’ experiments) [7, 27]. The performances (i.e. the fundamental diagram) differ significantly according to the spatial structure. Training and testing (cross-validation) are carried out for different combinations of bottleneck and ring experiments. The results show that the neural network is able to identify the spatial structure of the facility and to improve the prediction in the case of mixed structures. The models and the data used are presented in Sects. 2 and 3. The fitting of the neural network is described in Sect. 4, while the comparison to the FD-based model over combinations of bottleneck and ring experiments is given in Sect. 5. Conclusions are presented in Sect. 6.

2 Models

The pedestrian modelling approaches are continuous speed models based on the K = 10 closest neighbours. In the following, we denote by (x, y) the position of the considered agent, by v its speed and by \(((x_i, y_i),\ i = 1, \ldots, K)\) the positions of the K closest neighbours.

The first modelling approach is the Weidmann model for the fundamental diagram [29] for which the speed is a function of the mean spacing (i.e. the local density)

$$\displaystyle \begin{aligned} v=\text{FD}(\bar s_K,v_0,T,\ell)=v_0\Big(1-e^{\frac{\ell-\bar s_K}{v_0T}}\Big). \end{aligned} $$
(1)

Here \(\bar s_K=\frac 1K\sum _{i=1}^K\sqrt {(x-x_i)^2+(y-y_i)^2}\) is the mean spacing to the K = 10 closest neighbours, while the time gap \(T\), the pedestrian size \(\ell\) and the desired speed \(v_0\) are the parameters of the model.
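As an illustration, the mean spacing and Eq. (1) can be sketched in a few lines of Python. This is a minimal sketch and not the implementation used for the study; the function names and the helper for selecting the K closest neighbours are assumptions introduced here for illustration only.

```python
import math


def k_nearest(x, y, positions, K=10):
    """Return the K positions closest to (x, y) (hypothetical helper)."""
    return sorted(positions, key=lambda p: math.hypot(x - p[0], y - p[1]))[:K]


def mean_spacing(x, y, neighbours):
    """Mean Euclidean distance from (x, y) to the given neighbour positions."""
    return sum(math.hypot(x - xi, y - yi) for xi, yi in neighbours) / len(neighbours)


def weidmann_speed(s_mean, v0, T, ell):
    """Speed from Eq. (1): v = v0 * (1 - exp((ell - s_mean) / (v0 * T)))."""
    return v0 * (1.0 - math.exp((ell - s_mean) / (v0 * T)))
```

By construction, the speed vanishes when the mean spacing equals the pedestrian size \(\ell\) and tends to \(v_0\) as the mean spacing grows.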

The second modelling approach is a feed-forward artificial neural network with hidden layers H

$$\displaystyle \begin{aligned} v=\text{NN}\big(H,\bar s_K,(x_i-x,y_i-y,1\le i\le K)\big). \end{aligned} $$
(2)

The network has 2K + 1 inputs: the mean spacing and the K relative positions. The number of parameters in the network depends on the number of nodes in the hidden layers.
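The input vector of Eq. (2) and the resulting number of parameters can be sketched as follows. The ordering of the inputs, the function names, and the assumption of fully connected layers with one bias per node and a single output are illustrative choices made here, not details stated above.

```python
import math


def network_inputs(x, y, neighbours):
    """Build the 2K + 1 inputs: the mean spacing, then the K relative positions."""
    rel = [(xi - x, yi - y) for xi, yi in neighbours]
    s_mean = sum(math.hypot(dx, dy) for dx, dy in rel) / len(rel)
    inputs = [s_mean]
    for dx, dy in rel:
        inputs.extend((dx, dy))
    return inputs


def count_parameters(n_inputs, hidden, n_outputs=1):
    """Number of weights and biases, assuming fully connected layers
    with one bias per node (an assumption of this sketch)."""
    sizes = [n_inputs, *hidden, n_outputs]
    return sum((n_in + 1) * n_out for n_in, n_out in zip(sizes, sizes[1:]))
```

Under these assumptions, a network with 21 inputs (K = 10) and a single hidden layer of three nodes has (21 + 1) · 3 + (3 + 1) · 1 = 70 parameters, compared with the three parameters of the FD-based model.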

3 Data

Two datasets obtained in laboratory conditions are used to compare the FD-based and ANN modelling approaches. The data are available online (see [27]) and are part of the online database of pedestrian experiments [7]. The first dataset, denoted R and called the ring experiment, comes from an experiment carried out on a closed geometry of length 30 m and width 1.8 m for different density levels (ranging from 0.25 to 2 ped/m², with the number of participants ranging from 15 to 230). The second dataset, denoted B, comes from an experiment with a bottleneck geometry. The width of the system in front of the bottleneck is 1.8 m, while the width of the bottleneck varies (0.70, 0.95, 1.20 and 1.80 m, with 150 participants per experiment). See [27] for details on the data. The flow and density are measured every 10 s to obtain pseudo-independent measurements. Each sample contains around N = 2000 observations.

The two datasets describe two different interaction behaviours (see Fig. 2). The speed for a given mean spacing tends to be higher in the bottleneck than on the ring when the system is congested. Consequently, the estimate of the time gap differs significantly according to the geometry (see Table 1).

Fig. 2 Observations of the pedestrian speeds as a function of the mean spacing to the ten closest neighbours for the ring and bottleneck experiments. Two distinct relationships can be identified

Table 1 Fitting of the time gap \(T\), the pedestrian size \(\ell\) and the desired speed \(v_0\) of the Weidmann model on the ring and bottleneck experiments

4 Fitting the Neural Network

The neural network is fitted with cross-validation and bootstrap [14, 17] over 50 sub-samples. The training is performed using half of the data, while the network is tested on the remaining half. The training is carried out with the back-propagation method [22] on the normalised data, by minimising, from the top to the bottom of the network, the mean square error

$$\displaystyle \begin{aligned} \text{MSE}=\frac 1N\sum_i\big(v_i-\tilde v_i\big)^2, \end{aligned} $$
(3)

with \(v_i\) the observed speed, \(\tilde v_i\) the predicted speed and N the number of observations. The computation is done with the statistical software R [21] and the package neuralnet [9].
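The error of Eq. (3) and a repeated half-split evaluation can be sketched in Python as follows. This is not the R code used for the study: the exact resampling scheme is not detailed above, so the random half splits without replacement, the function names and the `fit` callable are assumptions of this sketch, and the normalisation of the data is omitted.

```python
import random


def mse(observed, predicted):
    """Mean square error of Eq. (3)."""
    return sum((v - p) ** 2 for v, p in zip(observed, predicted)) / len(observed)


def half_split(n, rng):
    """Randomly split the indices 0..n-1 into a training and a testing half."""
    idx = list(range(n))
    rng.shuffle(idx)
    return idx[: n // 2], idx[n // 2:]


def repeated_test_error(inputs, speeds, fit, n_rep=50, seed=0):
    """Average testing MSE over n_rep random half splits.

    `fit` is a hypothetical callable: it receives the training inputs and
    speeds and returns a function mapping one input to a predicted speed.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n_rep):
        train, test = half_split(len(speeds), rng)
        predict = fit([inputs[i] for i in train], [speeds[i] for i in train])
        observed = [speeds[i] for i in test]
        predicted = [predict(inputs[i]) for i in test]
        errors.append(mse(observed, predicted))
    return sum(errors) / len(errors)
```

Any model exposing such a `fit` interface, whether the FD-based model or a neural network, could be evaluated with the same routine, which keeps the comparison between the two approaches on identical splits.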

The hidden layer configurations H = (1), (2), (3), (4,2), (5,2), (5,3), (6,3), (10,4) and (12,5) are tested (see Fig. 3); here, for instance, (4,2) denotes two hidden layers with four and two nodes, respectively. As expected, the training error tends to decrease as the complexity of the network increases, while the testing error reaches a minimum before overfitting sets in. This minimum is reached for the single hidden layer H = (3) with three nodes. Note that here the calibration is done on a combination of the ring and bottleneck datasets. Similar results are nevertheless obtained when calibrating separately on the ring and on the bottleneck datasets.

Fig. 3 Training and testing errors for different hidden layers in the network. The curves correspond to the average of 50 bootstrap estimates, while the bands describe the standard deviation. The training error systematically decreases as the network complexity increases, while the testing error admits a minimum for the hidden layer H = (3)

5 Model Comparison

The calibrated FD-model and the trained neural network with H = (3) are compared on different combinations of data from the ring R and bottleneck B experiments. Here the first argument in the notation ‘. / .’ corresponds to the training phase, while the second argument corresponds to the testing phase. The testing errors are presented in Fig. 4. The modelling approaches are first analysed on the identical sets R∕R and B∕B. Here the network is slightly better than the FD-model. For the ring experiment, the performances are relatively homogeneous and the MSE is only approximately 5% smaller with the network. For the bottleneck, by contrast, the performances fluctuate more and the improvement is more significant (around 15%). The improvement is also significant when the approaches deal with unobserved situations, i.e. for the datasets R∕B and B∕R (around 15%). The best results are obtained when training the models on the combination of ring and bottleneck experiments, i.e. the scenarios R∕R+B, B∕R+B and R+B∕R+B. As shown in Fig. 5 and Table 2, the network is able in such situations to partially differentiate the two geometries and to improve the speed estimation by approximately 20%. The orders of improvement are similar to the ones obtained in [1] with the social LSTM neural network and the social force pedestrian model [11].

Fig. 4 Testing error of the FD-model and the neural network for combinations of the ring R and bottleneck B experiments. The curves correspond to the average of 50 bootstrap estimates, while the bands describe the standard deviation. The improvement in the speed prediction obtained with the network is significant when the experiments are mixed (i.e. R+B)

Fig. 5 Prediction by the neural network of the pedestrian speed for the R+B∕R+B training and testing datasets. The network is able to identify the two geometries, at least partially. As observed in the real data, the speed for a given mean spacing is on average higher in the bottleneck than in the corridor in congested situations

Table 2 Fitting of the time gap \(T\), the pedestrian size \(\ell\) and the desired speed \(v_0\) for the data predicted by the neural network

6 Conclusion

The data-driven approach using an artificial neural network is able to distinguish pedestrian performances in ring and bottleneck experiments from the relative positions of the K = 10 closest neighbours and the mean spacing. Consequently, we observe that the speed prediction for mixed data can be improved by up to 20% by using a network instead of an aggregated model based on fundamental diagrams.

These results are first steps suggesting that neural networks could be suitable tools for the prediction of pedestrian dynamics in complex geometries. Yet the simulation of the networks remains to be carried out over full trajectories and compared to the performances obtained with existing models, notably anisotropic models. Furthermore, other inputs, hidden layers and training on different geometries have to be investigated. In particular, it remains to test the network complexity required for accurate predictions with respect to the size and heterogeneity of the datasets.