1 Introduction

The study of pedestrian dynamics has important applications in crowd management such as devising strategies for the evacuation of buildings or public places. In order to evaluate the predictive power of mathematical models designed to emulate human crowd behavior, it is a common procedure to compare numerical simulations based on these models with empirical data. These empirical data are usually extracted from video recordings of either naturally occurring human crowds [7] or pedestrian flows that have been produced by controlled experiments [2, 3, 23]. In general, the latter are devised to demonstrate crowd behavior in special situations such as evacuation or passing through a bottleneck. In our work, we put a particular emphasis on analyzing intersecting pedestrian flows, and in Sect. 2 we describe experiments that were conducted with this purpose.

Furthermore, different modeling approaches demand the extraction of different types of data: For example, the social force model [6] and the cellular automaton model [1] aim at predicting pedestrian trajectories, whereas continuum methods [8] adopted from fluid mechanics describe the dynamics via a density and flow field associated with the crowd. Computing the density over a wide range of spatial scales from a crowd consisting of only a few pedestrians is a challenging task because of the low number of samples. In this context, we propose a variable-width kernel density estimation described in Sect. 3 and apply this algorithm to our experimental data (Sect. 4).

We conclude with a short summary and an overview of remaining problems in Sect. 5.

2 Experiments

In the experiment which we use here to illustrate our method, two pedestrian flows (group A, 142 subjects, and group B, 83 subjects) intersected at an angle of 90° for 1 min in a region of about 25 m², reaching a peak density of about five pedestrians per square meter. The scene was recorded from a gallery at a height of about 6 m with five networked and temporally synchronized JVC VN-V25U surveillance video cameras. Here, we analyze the data provided by the three central cameras which covered the area where the actual intersecting of the pedestrian flows took place, see Fig. 1. A similar experiment with this purpose has been conducted in [4]. However, we process data from multiple cameras and can therefore cover a larger portion of the observation area than a single camera view would allow. Also, in our experiments, the pedestrians did not move along specified, confined corridors.

The cameras were calibrated by applying a pinhole model to the world and image coordinates of about 30 reference points in the scene. After camera calibration, the spatio-temporal positions of the pedestrians were extracted by photogrammetric means—a particular challenge was presented by the fact that due to constructional limitations the scene could not be captured from a bird’s eye view. For more details, we refer to [14, 15]. First, for each frame, the heads of the pedestrians were marked manually, aided by the Lucas–Kanade tracking algorithm [11, 18]. Then, for each pedestrian the floor position was marked in (at least) one frame in order to compute the height of the respective pedestrian via the homography determined from camera calibration. This information is sufficient to calculate the pedestrians’ world coordinates above ground for each frame. Smooth trajectories were then obtained via approximation with cubic B-splines. Finally, the trajectories extracted from different cameras were merged by combinatorial assignment with the Kuhn–Munkres algorithm, also known as the Hungarian method [9, 13].

The experimental data, i.e., videos and extracted trajectories, can be downloaded at http://www.math.tu-berlin.de/projekte/smdpc/.

Fig. 1

Intersecting pedestrian flows. Red diamonds: group A, blue crosses: group B

3 Density and Flow Estimation

Empirical data of human crowd behaviors are usually represented by the trajectories of the pedestrians. Probably the most basic way to compute a density from such trajectories would be to divide the number of pedestrians in a given region by the area of that region, at a given point in time. However, this “standard” density estimator yields data with large scatter—let alone a smooth density function defined at every point. Very similar problems occur when estimating the flow by counting pedestrians that pass through a given cross section.

At least two approaches for computing the density have been suggested in the literature as alternatives:

  1. In [7], a local density field is computed via the sum of Gaussians with a fixed standard deviation (typically 0.7 m) centered at each pedestrian. Formally, this approach may be recognized as a kernel density estimation with fixed bandwidth, which is a basic tool in statistical data analysis (see [19], for example). This method results in a smooth density field defined at every point. Of course, the kernel estimator yields the same result as the standard density when spatially averaged across large regions. However, for isolated pedestrian groups of “mesoscopic” size one typically observes values that are significantly lower than the standard density since a large portion of the “pedestrian mass” is located outside of the respective region.

  2. The authors of [20] propose two similar estimators, both of which are based on the Voronoi diagram defined by each pedestrian’s position as a Voronoi site. The Voronoi method has been successfully applied to study pedestrian flows through observational areas that feature a constrained geometry, such as corridors or the vicinity of bottlenecks, see also [23]. The main idea in this approach is to account for the personal space occupied by each pedestrian, and this personal space is represented by the area of the corresponding Voronoi cell. The values for the Voronoi density are very close to standard densities, but with significantly less scatter. However, the Voronoi estimator does not yield a smooth local density function defined at every point. Also, for sparse and unconstrained crowds, a significant number of Voronoi cells may be quite large (in fact, may even have infinite area), resulting in densities that are lower than expected. A more recent suggestion mitigates this problem by introducing a fixed limit to the size of the cells [10]. In this paper, we only consider the original definition for the Voronoi density that is denoted in [20] as “\(D_{V}\)”.

In this work, we propose yet another method, based on kernel estimation with a variable bandwidth defined by Eq. (3) below. This method is conceptually a blend of the Voronoi estimator (accounting for personal space) and the fixed-bandwidth kernel estimator (yielding smooth density fields).

3.1 Standard and Voronoi Density

At a given point in time t, suppose we observe a (large) number N of pedestrians labeled by some suitable index set J such that \(\vert J\vert = N\). The positions of these pedestrians are denoted by \(\boldsymbol{x}_{j}(t) \in {\mathbb{R}}^{2}\), j ∈ J. We may then define a local density distribution

$$\displaystyle{ \rho (t,\boldsymbol{x}) =\sum _{i\in I}k_{i}(t,\boldsymbol{x}) =\sum _{i\in I}\delta (\boldsymbol{x} -\boldsymbol{ x}_{i}(t)) }$$
(1)

where δ denotes the Dirac delta function. The index set I ⊂ J labels the pedestrians of interest—for example, the whole crowd, in which case I = J, or a group of pedestrians with a common destination. For this density field—which is obviously not smooth but highly singular—the spatial average across some region \(\varOmega \subset {\mathbb{R}}^{2}\) is simply given by the number of pedestrians contained in Ω divided by the area of Ω:

$$\displaystyle{\langle \rho (t,\,\cdot \,)\rangle _{\varOmega } = \frac{1} {\vert \varOmega \vert }\int _{\varOmega }\rho (t,\boldsymbol{y})\,\mathrm{{d}}^{2}y = \frac{1} {\vert \varOmega \vert }\sum _{i\in I}\int _{\varOmega }\delta (\boldsymbol{y} -\boldsymbol{ x}_{i}(t))\,\mathrm{{d}}^{2}y = \frac{\vert I_{\varOmega }(t)\vert } {\vert \varOmega \vert } }$$

with \(I_{\varOmega }(t):=\{ j \in I\vert \boldsymbol{x}_{j}(t) \in \varOmega \}\). We will refer to this way of computing the density as the standard density.
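The standard density can be sketched in a few lines. This is a minimal illustration, assuming a rectangular region Ω and positions given as coordinate pairs in meters; the function name `standard_density` and the `(x_min, y_min, x_max, y_max)` convention are hypothetical and not part of the original work.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# Hypothetical convention for a rectangular region: (x_min, y_min, x_max, y_max).
Rect = Tuple[float, float, float, float]


def standard_density(positions: List[Point], region: Rect) -> float:
    """Number of pedestrians inside the rectangle divided by its area."""
    x_min, y_min, x_max, y_max = region
    area = (x_max - x_min) * (y_max - y_min)
    if area <= 0.0:
        raise ValueError("region must have positive area")
    # Count the pedestrians whose position lies in the region.
    count = sum(
        1 for (x, y) in positions if x_min <= x < x_max and y_min <= y < y_max
    )
    return count / area


# Four pedestrians, three of which are inside a 2 m x 2 m region.
example_positions = [(0.5, 0.5), (1.5, 0.2), (2.5, 2.5), (0.9, 1.1)]
example_density = standard_density(example_positions, (0.0, 0.0, 2.0, 2.0))
```

Because the result changes by 1/|Ω| whenever a pedestrian crosses the boundary, small regions produce the large scatter discussed above.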

The Voronoi method also defines a local density field which—like the standard density, or the kernel density defined in the next subsection—may be written in the form \(\rho (t,\boldsymbol{x}) =\sum _{i\in I}k_{i}(t,\boldsymbol{x})\). In this case, we have

$$\displaystyle{k_{i}(t,\boldsymbol{x}) = \left \{\begin{array}{@{}l@{\quad }l@{}} \frac{1} {\vert V _{i}(t)\vert }\quad &\mbox{ if }\boldsymbol{x} \in V _{i}(t), \\ 0 \quad &\mbox{ if }\boldsymbol{x}\not\in V _{i}(t), \end{array} \right.}$$

where \(V_{i}(t)\) is the Voronoi cell the seed of which is given by \(\boldsymbol{x}_{i}(t)\). Obviously, this local density field is, in general, not continuous. Also note that without the introduction of some upper bound to the size of the cells, the local Voronoi density is not properly normalized, i.e., \(\int _{{\mathbb{R}}^{2}}\rho (t,\boldsymbol{y})\,\mathrm{{d}}^{2}y\neq \vert I\vert \).

3.2 Kernel Density and Flow Estimators

In order to obtain a smooth density field instead of the distribution given by Eq. (1), one way is to replace the Dirac distribution by suitable nascent Dirac functions. One may understand this technique as a smoothing procedure that replaces the singular Dirac peaks by peaks of finite height and non-zero width. The main difficulty lies in choosing appropriate widths for the new peaks. To this end, consider a kernel pedestrian density estimator with isotropic kernel function:

$$\displaystyle{\rho (t,\boldsymbol{x}) =\sum _{i\in I}k_{i}(t,\boldsymbol{x}) =\sum _{i\in I} \frac{1} {{(\lambda d_{i}(t))}^{2}} \cdot K\left (\frac{\boldsymbol{x} -\boldsymbol{ x}_{i}(t)} {\lambda d_{i}(t)} \right ).}$$

The dimensionless number λ > 0 is a global smoothing parameter. In the following, we will always assume λ = 1, and that the kernel is given by a Gaussian function:

$$\displaystyle{K(\boldsymbol{y}) = \frac{1} {2\pi }\exp \left (-\frac{\|\boldsymbol{{y}\|}^{2}} {2} \right ).}$$
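The kernel estimator with a Gaussian kernel can be evaluated pointwise as follows. This is a minimal sketch, assuming that the bandwidths \(d_{i}(t)\) have already been computed and are passed in as a list; the function names and the parameter name `lam` (for λ) are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def gaussian_kernel(y: Point) -> float:
    """Isotropic two-dimensional Gaussian kernel K(y) with unit width."""
    return math.exp(-0.5 * (y[0] ** 2 + y[1] ** 2)) / (2.0 * math.pi)


def kernel_density(
    x: Point,
    positions: Sequence[Point],
    bandwidths: Sequence[float],
    lam: float = 1.0,
) -> float:
    """Sum of rescaled kernels centered at each pedestrian, evaluated at x."""
    total = 0.0
    for (px, py), d in zip(positions, bandwidths):
        h = lam * d  # effective width of this pedestrian's kernel
        u = ((x[0] - px) / h, (x[1] - py) / h)
        total += gaussian_kernel(u) / (h * h)
    return total
```

The prefactor 1/(λ d_i)² ensures that each pedestrian contributes unit mass, whatever the bandwidth. Passing the same value for every pedestrian (for example 0.7) gives a fixed-bandwidth estimator of the kind used in [7].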

Probably the most natural formula for computing a corresponding flow field would be given by:

$$\displaystyle{ \boldsymbol{j}_{m}(t,\boldsymbol{x}) =\sum _{i\in I}k_{i}(t,\boldsymbol{x}) \cdot \frac{\mathrm{d}\boldsymbol{x}_{i}(t)} {\mathrm{d}t}. }$$
(2)

However, the continuity equation is generally not satisfied with \(\boldsymbol{j}_{m}\) as the only flow component. Instead, we have

$$\displaystyle{\frac{\partial \rho (t,\boldsymbol{x})} {\partial t} +\mathrm{ div}(\boldsymbol{j}_{m}(t,\boldsymbol{x}) +\boldsymbol{ j}_{c}(t,\boldsymbol{x})) = 0}$$

with an additional, irrotational flow

$$\displaystyle{\boldsymbol{j}_{c}(t,\boldsymbol{x}) =\sum _{i\in I}k_{i}(t,\boldsymbol{x}) \cdot \frac{\mathrm{d}\,\mathrm{ln}(d_{i}(t))} {\mathrm{d}t} \cdot (\boldsymbol{x} -\boldsymbol{ x}_{i}(t)).}$$
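That this choice of \(\boldsymbol{j}_{c}\) closes the continuity equation can be checked for each summand separately. The following sketch assumes λ = 1 and a differentiable kernel K. It uses the shorthand \(\boldsymbol{u} = (\boldsymbol{x} -\boldsymbol{x}_{i}(t))/d_{i}(t)\) and dots for time derivatives; this notation is introduced here for illustration only. With \(k_{i} = d_{i}^{-2}K(\boldsymbol{u})\), the chain rule gives

$$\displaystyle{\frac{\partial k_{i}} {\partial t} = -\frac{2\dot{d}_{i}} {d_{i}} k_{i} - \frac{1} {d_{i}^{3}}\nabla K(\boldsymbol{u}) \cdot \dot{\boldsymbol{x}}_{i} -\frac{\dot{d}_{i}} {d_{i}^{3}}\nabla K(\boldsymbol{u}) \cdot \boldsymbol{u},}$$

$$\displaystyle{\mathrm{div}\left (k_{i}\,\dot{\boldsymbol{x}}_{i}\right ) = \frac{1} {d_{i}^{3}}\nabla K(\boldsymbol{u}) \cdot \dot{\boldsymbol{x}}_{i},}$$

$$\displaystyle{\mathrm{div}\left (k_{i}\,\frac{\dot{d}_{i}} {d_{i}} \,(\boldsymbol{x} -\boldsymbol{ x}_{i})\right ) = \frac{2\dot{d}_{i}} {d_{i}} k_{i} + \frac{\dot{d}_{i}} {d_{i}^{3}}\nabla K(\boldsymbol{u}) \cdot \boldsymbol{u}.}$$

The three right-hand sides add up to zero, so each summand, and hence the sum over i ∈ I, satisfies the continuity equation.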

Note that \(\boldsymbol{j}_{c}\) vanishes if, for each label i ∈ I, the corresponding bandwidth \(d_{i}(t)\) is fixed, i.e., it does not depend on the point in time t. Thus, if we wish to enforce a law of “pedestrian mass conservation”, a sensible choice for the total flow is given by \(\boldsymbol{j} =\boldsymbol{ j}_{m} +\boldsymbol{ j}_{c}\). Furthermore, Eq. (2) also applies to the limiting case \(d_{i} \rightarrow 0\), i.e., the standard density, to provide a standard flow. The spatial average of the standard flow with respect to some region Ω is simply given by the sum of the pedestrians’ individual velocities divided by the area of that region:

$$\displaystyle{\frac{1} {\vert \varOmega \vert }\sum _{i\in I}\int _{\varOmega }\delta (\boldsymbol{y} -\boldsymbol{ x}_{i})\,\mathrm{{d}}^{2}y \cdot \frac{\mathrm{d}\boldsymbol{x}_{i}(t)} {\mathrm{d}t} = \frac{1} {\vert \varOmega \vert }\sum _{i\in I_{\varOmega }(t)}\frac{\mathrm{d}\boldsymbol{x}_{i}(t)} {\mathrm{d}t}.}$$
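The spatially averaged standard flow can be sketched in the same way as the standard density. The snippet assumes a rectangular region and velocities that were obtained beforehand (for example, by differentiating the smoothed trajectories); the function name and the rectangle convention are hypothetical.

```python
from typing import Sequence, Tuple

Vec = Tuple[float, float]
# Hypothetical convention for a rectangular region: (x_min, y_min, x_max, y_max).
Rect = Tuple[float, float, float, float]


def standard_flow(
    positions: Sequence[Vec], velocities: Sequence[Vec], region: Rect
) -> Vec:
    """Sum of the velocities of all pedestrians inside the region, divided by its area."""
    x_min, y_min, x_max, y_max = region
    area = (x_max - x_min) * (y_max - y_min)
    if area <= 0.0:
        raise ValueError("region must have positive area")
    sum_x = 0.0
    sum_y = 0.0
    for (x, y), (vx, vy) in zip(positions, velocities):
        if x_min <= x < x_max and y_min <= y < y_max:
            sum_x += vx
            sum_y += vy
    return (sum_x / area, sum_y / area)


# Two pedestrians inside a 2 m x 2 m region, one outside.
example_flow = standard_flow(
    [(0.5, 0.5), (1.5, 0.5), (3.0, 3.0)],
    [(1.0, 0.0), (0.0, 1.0), (5.0, 5.0)],
    (0.0, 0.0, 2.0, 2.0),
)
```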

As for the choice of bandwidth, one may assume it to be fixed, for example, \(d_{i}(t) \equiv 0.7\) m [7]. More generally, the numbers \(d_{i}(t)\) can be computed from the current positions of the pedestrians in a suitable way; a formal analogue of this procedure in statistical data analysis is known as a sample smoothing estimator [21]. For the nearest-neighbor kernel estimator we previously proposed in [14], we have:

$$\displaystyle{d_{i}(t) =\min _{j\in J,\,j\neq i}(\|\boldsymbol{x}_{i}(t) -\boldsymbol{ x}_{j}(t)\|).}$$

Note, however, that in general, the functions \(t\mapsto d_{i}(t)\) defined in this way are not differentiable, yielding a density field that is not smooth with respect to the time variable t. Therefore, for some fixed additional parameter \(p \in \mathbb{R}\), p > 1, we propose the following functions instead (cf. [15]):

$$\displaystyle{ d_{i}^{(p)}(t) ={ \left (\sum _{ j\in J,\,j\neq i}{(\|\boldsymbol{x}_{i}(t) -\boldsymbol{ x}_{j}(t)\|)}^{-p}\right )}^{-\frac{1} {p} }. }$$
(3)

The bandwidths thus defined are smooth functions and at the same time generalize the nearest-neighbor kernel as its limiting case of \(p \rightarrow \infty \). In the following, when we speak of variable bandwidth, we will always assume that Eq. (3) applies, with p = 4. Figure 2 shows a toy-model calculation for a single pedestrian: The bandwidth decreases with the number of pedestrians located in the near vicinity, and their respective distance. We interpret this behavior as the compression of personal space in crowded situations, see Sect. 4.
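Both bandwidth definitions are straightforward to compute from the current positions. The following sketch assumes at least two pedestrians with pairwise distinct positions (coinciding positions would lead to a division by zero); the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def nearest_neighbor_bandwidth(i: int, positions: Sequence[Point]) -> float:
    """Distance from pedestrian i to the closest other pedestrian."""
    return min(
        math.dist(positions[i], q) for j, q in enumerate(positions) if j != i
    )


def smooth_bandwidth(i: int, positions: Sequence[Point], p: float = 4.0) -> float:
    """Bandwidth of Eq. (3): a smooth surrogate for the nearest-neighbor distance."""
    # Sum of the distances raised to the power -p; close neighbors dominate.
    s = sum(
        math.dist(positions[i], q) ** (-p)
        for j, q in enumerate(positions)
        if j != i
    )
    return s ** (-1.0 / p)


# Pedestrian A at the origin, B at a distance of 1 m, C at a distance of 2 m.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
d_smooth = smooth_bandwidth(0, pts, p=4.0)
d_nearest = nearest_neighbor_bandwidth(0, pts)
```

Since the sum is at least as large as its largest term, the value from Eq. (3) never exceeds the nearest-neighbor distance, and it approaches that distance as p grows.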

Fig. 2

The bandwidth, defined by Eq. (3), assigned to a particular pedestrian A as a function of the distance to another individual pedestrian B. Dotted line: with no other pedestrian present. Solid line (dashed line): with one other pedestrian C (three other pedestrians C, D and E) located at a constant distance of 2 m to A

3.3 Modeling Obstacles

In order to model obstacles, we propose the following additional procedure to be implemented when computing smooth density and flow fields via kernel estimation. Note, however, that we choose not to use this method for the computations presented here since the computational overhead is barely justified for the few obstacles present in the area where we conducted our experiments. Nevertheless, for more constrained geometries—such as a corridor, for example—we expect this method to be of some value. First, define the characteristic function of the experimental area:

$$\displaystyle{\chi (\boldsymbol{x}) = \left \{\begin{array}{@{}l@{\quad }l@{}} 0\quad &\mbox{ if }\boldsymbol{x}\mbox{ is inside an obstacle,}\\ 1\quad &\mbox{ if } \boldsymbol{x}\mbox{ is not inside an obstacle.} \end{array} \right.}$$

Then, define a smoothed characteristic function by convolution with a suitable mollifier:

$$\displaystyle{\chi _{\epsilon }(\boldsymbol{x}):= (\chi {\ast}\psi _{\epsilon })(\boldsymbol{x}) =\int _{{\mathbb{R}}^{2}}\chi (\boldsymbol{y}) \cdot \psi _{\epsilon }(\boldsymbol{x} -\boldsymbol{ y})\,\mathrm{{d}}^{2}y.}$$

For example,

$$\displaystyle{\psi _{\epsilon }(\boldsymbol{x}) = \left \{\begin{array}{@{}l@{\quad }l@{}} \frac{1} {{\epsilon }^{2}} \exp (- \frac{{\epsilon }^{2}} {{\epsilon }^{2}-\|\boldsymbol{{x}\|}^{2}} )\quad &\mbox{ if }\|\boldsymbol{x}\| <\epsilon, \\ 0 \quad &\mbox{ if }\|\boldsymbol{x}\| \geq \epsilon. \end{array} \right.}$$

For each fixed time t and pedestrian i ∈ I, compute (by any method of choice) the function \(k_{i}(t,\boldsymbol{x})\) associated with this pedestrian’s position, and correct this function so that it vanishes inside of obstacles:

$$\displaystyle{k_{i}^{(\mathrm{corr})}(t,\boldsymbol{x}) = \frac{k_{i}(t,\boldsymbol{x}) \cdot \chi _{\epsilon }(\boldsymbol{x})} {\int _{{\mathbb{R}}^{2}}k_{i}(t,\boldsymbol{y}) \cdot \chi _{\epsilon }(\boldsymbol{y})\,\mathrm{{d}}^{2}y}.}$$
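On a regular grid, the correction amounts to a pointwise multiplication followed by a renormalization. The sketch below assumes that the kernel and the (smoothed) characteristic function have already been sampled on the same grid; for brevity, the example uses the unsmoothed χ, which is a simplification and not the procedure described above. All names are hypothetical.

```python
import math
from typing import List

Grid = List[List[float]]


def correct_kernel(kernel: Grid, chi: Grid, cell_area: float) -> Grid:
    """Multiply the kernel by chi and rescale so that it integrates to one again."""
    weighted = [[k * c for k, c in zip(k_row, c_row)] for k_row, c_row in zip(kernel, chi)]
    # Riemann-sum approximation of the normalizing integral in the denominator.
    mass = sum(sum(row) for row in weighted) * cell_area
    if mass <= 0.0:
        raise ValueError("kernel has no mass outside of obstacles")
    return [[w / mass for w in row] for row in weighted]


# Example: unit Gaussian at the origin, obstacle occupying the half plane x > 0.5.
h = 0.1
xs = [-3.0 + h * i for i in range(61)]
ys = xs
kernel = [[math.exp(-0.5 * (x * x + y * y)) / (2.0 * math.pi) for y in ys] for x in xs]
chi = [[0.0 if x > 0.5 else 1.0 for _ in ys] for x in xs]
corrected = correct_kernel(kernel, chi, h * h)
```

Because of the renormalization, a constant factor in \(\chi _{\epsilon }\) has no effect on the corrected kernel.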

Calculate the density ρ and the flow component \(\boldsymbol{j}_{m}\) with these corrected kernel functions. In order to compute the corresponding value for \(\boldsymbol{j}_{c}\), numerically solve the Poisson equation

$$\displaystyle{\bigtriangleup u = \frac{\partial \rho } {\partial t} +\mathrm{ div}(\boldsymbol{j}_{m}),}$$

and define \(\boldsymbol{j}_{c} = -\mathrm{grad}(u)\). In order to solve this equation, one may use, for example, finite-differencing [16, pp. 1024–1030], and choose a constant Dirichlet boundary condition far away from the observation area to enforce uniqueness.
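The Poisson step can be sketched with a plain finite-difference scheme. The snippet below assumes a square grid with spacing `h`, a homogeneous Dirichlet condition on the grid boundary, and Gauss–Seidel sweeps as the iterative solver; these are illustrative choices, not necessarily those of [16]. The right-hand side `f` stands for the sampled values of ∂ρ/∂t + div(j_m), replaced here by a test function with a known solution.

```python
import math
from typing import List, Tuple

Grid = List[List[float]]


def solve_poisson_dirichlet(f: Grid, h: float, sweeps: int = 500) -> Grid:
    """Gauss-Seidel sweeps for the 5-point discretization of laplace(u) = f, u = 0 on the boundary."""
    n_rows, n_cols = len(f), len(f[0])
    u = [[0.0] * n_cols for _ in range(n_rows)]
    for _ in range(sweeps):
        for i in range(1, n_rows - 1):
            for j in range(1, n_cols - 1):
                u[i][j] = 0.25 * (
                    u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1]
                    - h * h * f[i][j]
                )
    return u


def negative_gradient(u: Grid, h: float, i: int, j: int) -> Tuple[float, float]:
    """Central-difference approximation of -grad(u) at the interior node (i, j)."""
    return (
        -(u[i + 1][j] - u[i - 1][j]) / (2.0 * h),
        -(u[i][j + 1] - u[i][j - 1]) / (2.0 * h),
    )


# Test problem on the unit square: u = sin(pi x) sin(pi y), so laplace(u) = -2 pi^2 u.
n = 21
h = 1.0 / (n - 1)
exact = [[math.sin(math.pi * i * h) * math.sin(math.pi * j * h) for j in range(n)] for i in range(n)]
f = [[-2.0 * math.pi ** 2 * exact[i][j] for j in range(n)] for i in range(n)]
u = solve_poisson_dirichlet(f, h)
max_error = max(abs(u[i][j] - exact[i][j]) for i in range(n) for j in range(n))
```

In an application, the grid would extend well beyond the observation area so that the boundary condition has little influence on \(\boldsymbol{j}_{c}\) inside it.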

Fig. 3

Pedestrian density field at a fixed point in time. Top: variable-bandwidth estimator, bottom: fixed-bandwidth estimator (\(d_{i} = 0.7\) m). Max. value: 7.1 m⁻². Red diamonds: group A, blue crosses: group B

Fig. 4

Pedestrian flow field at a fixed point in time. Top: variable-bandwidth estimator, bottom: fixed-bandwidth estimator (\(d_{i} = 0.7\) m). Blue and red arrows indicate the velocities of individual pedestrians

4 Results and Discussion

Figure 3 illustrates how the variable-bandwidth estimator distributes pedestrian mass to favor densely crowded regions. Probably any sensible macroscopic crowd model provides a mechanism that prevents pedestrian mass from being distributed to regions of already high density since pedestrians generally avoid crowded areas. On the other hand, as opposed to this repulsive short-range action, it is often assumed that there is also an attractive long-range action (similar to the effect of chemotaxis, cf. [17]). Our method of computing densities is in fact consistent with this assumption: Consider two single pedestrians approaching one another. At long distances, they will not regard each other as obstacles since their personal space is still large, and their mass is distributed over a large region. At shorter distances, however, they will avoid entering the personal space of each other since this would lead to a very large local density.

Also, this tendency to highlight pedestrian clusters may prove to be advantageous for (visually) identifying (social) groups in naturally occurring human crowds, the study of which is also an important task; see, for example, [12].

As for the flow computed via the variable-bandwidth method, the additional component \(\boldsymbol{j}_{c}\) arises from the shrinking or expanding of the Gaussians due to the change in distance between pedestrians. In other words, \(\boldsymbol{j}_{c}\) describes the transport of mass via compression of the pedestrians’ personal space in crowded situations. On the other hand, one might be tempted to think of \(\boldsymbol{j}_{m}\) as the “actual” transport of pedestrian mass due to the displacement of pedestrians. However, one has to be careful with this interpretation since the length scale determined by Eq. (3) is to be understood as a free path and not an approximation of the physical size of the pedestrian. Nevertheless, in the context of crowd disasters, we suggest that this distinction between the flow components may be an appropriate way to identify panic situations: Even if regions with large density ρ or flow \(\boldsymbol{j}\) exist, this does not necessarily indicate a (potentially) dangerous situation; imagine, for example, an elevator full of people or a large marathon event, respectively. However, it has been noted that panic situations can be characterized by a number of typical features, a comprehensive list of which is given in [5]. These features include physical interactions between people, clogging and uncoordinated movement, all of which are indicated by large changes in density due to the compression of personal space (Fig. 4). Figure 4 also compares the fixed-bandwidth and variable-bandwidth flow by example. The fixed-bandwidth flow appears as a simple superposition of the pedestrian flows A and B. Both computation methods yield a free flow of pedestrian mass exiting the observation area in the region marked (b). In contrast, in region (a), the variable-bandwidth estimator shows a sink of pedestrian mass which is due to the compression of personal space when one lane of the bifurcating flow B meets the dominant flow A almost head on.

Figure 5 shows the functions \(\mathrm{div}\boldsymbol{j}\) and \(\mathrm{div}\boldsymbol{j}_{c}\) spatially averaged across a microscopic region, plotted against time. We see that temporal changes in density are more pronounced with the variable-bandwidth method. It can also be noted that the temporal changes of personal space indicated by \(\mathrm{div}\boldsymbol{j}_{c}\) have a larger amplitude when the two pedestrian flows actually meet. However, we acknowledge that this observation is not fully conclusive since the large scatter might also be caused by random measurement errors.

4.1 Comparison with Other Methods

In the following, we compare the different methods mentioned in this work as to their ability to represent the mean density and flow on different spatial scales. To this end, the spatially averaged density and flow are plotted versus time in Fig. 6, computed via different methods in three regions of different size. One may characterize these regions as microscopic, mesoscopic and macroscopic, although we do not wish to claim that this terminology should generally be applied to regions of the respective size.

Fig. 5

Divergence of the flow vs. time, spatially averaged across the microscopic region \(\varOmega _{1}\) marked in Fig. 3, \(\vert \varOmega _{1}\vert = 1\) m². Thin black line: fixed-bandwidth estimator, thick black line: variable-bandwidth estimator, dashed blue line: divergence of the flow component \(\boldsymbol{j}_{c}\)

Fig. 6

Spatial average across the rectangular regions marked in Fig. 3. From top to bottom: region with area \(\vert \varOmega _{1}\vert = 1\) m², \(\vert \varOmega _{2}\vert = 6\) m² and \(\vert \varOmega _{3}\vert = 24\) m². Thin black line: fixed-bandwidth estimator, thick black line: variable-bandwidth estimator, thin dashed blue line: standard density/flow, thick dashed green line: Voronoi density/flow. (a) Density vs. time. (b) Flow of pedestrian group A in x-direction vs. time

For the computations, we assumed that the pedestrians stop and cease to move once they exit the area covered by the cameras. This was to avoid outliers in the flow measurement caused by incomplete trajectories (cf. the first issue noted in the concluding section of this work). Also, this workaround ensures a fairer comparison with the Voronoi method, which we implemented in the unmodified form originally proposed in [20] and which was not designed for sparse, unconstrained crowds.

Naturally, the methods yield very similar values for the macroscopic region. However, for microscopic regions a significant difference can be noted: the standard density shows large scatter while all three alternative methods appear as smoothed versions of this standard density. Therefore, these methods may be used to reliably compute the density and flow for sparse or dense crowds, and work well on any scale.

While there is not much difference in the total values of the density, the temporal variation, and therefore the divergence of the flow, may show vast differences between fixed and variable bandwidth method as illustrated by Fig. 5.

5 Conclusion

In this paper, we have demonstrated that kernel estimation methods provide an attractive alternative to the standard or the Voronoi method of measuring densities or flows. Also, these kernel methods naturally yield smooth density and flow fields. We argue that estimating such fields from experimental data is helpful in evaluating macroscopic crowd models which yield precisely this type of data. Also, we have shown that even kernel methods with variable bandwidth may be designed in such a way that the resulting density and flow fields satisfy the continuity equation.

Moreover, the variable-bandwidth kernel estimator that we propose may provide a useful model for changes in personal space, which is also a key idea when formulating the approach based on Voronoi diagrams. These changes in personal space are reflected in the dynamic density and flow fields and may be used to effectively visualize effects such as clogging or counter flows.

However, there are some practical and theoretical issues that particularly concern the variable-bandwidth kernel method:

  1. In human crowd experiments, trajectories are often incomplete as pedestrians leave the area observed by the cameras. This circumstance usually results in the computation of temporally discontinuous densities and very large flows because of the sudden change in density when pedestrians leave the observation area and “vanish”. Although this problem may potentially present itself with any computation method, the variable-bandwidth kernel method is particularly susceptible to it.

  2. It would be preferable to obtain the parameter values λ and p in a data-driven, automatic manner, for example, by techniques already known from statistical data analysis. We suggest that one approach to obtain the parameter λ, at least, might be given in [22]. On the other hand, analogies relating probability and pedestrian densities only extend to a certain degree, and one might think that these parameter values should be fixed as they are inherent to any crowd configuration.

  3. We have to acknowledge that the variable-bandwidth estimator lacks robustness against errors in the measurement of the pedestrians’ trajectories. For example, if the spatio-temporal positions of two different pedestrians were to (almost) coincide due to a measurement error, very large, factually incorrect density values may occur. If no reliable data is available, a workaround would be given by enforcing a lower limit on the bandwidth.

  4. In some circumstances, the variable-bandwidth estimator yields large local density values (> 10 m⁻²) that, in principle, cannot be predicted by macroscopic models which usually have a restricted range of density values. However, this might only reflect the fact that macroscopic models are designed to describe crowds at large scales.

  5. As already noted earlier, the divergence of the flow depends significantly on the type of kernel used for computation, at least when averaged across microscopic regions (see Fig. 5). It also largely depends on the chosen parameters, and the relationship of such graphs to discrete analogues based on the standard density remains unclear.

We conclude by noting that the measuring techniques presented here may be understood as particular smoothing procedures based on the standard definition of a particle density, which do not seem to add any particularly relevant information to the data. In fact, if one is solely interested in the spatial average across macroscopic regions of dense crowds, probably any technique would suffice. However, even an artificial increase in spatio-temporal resolution may prove to be insightful when visualizing data based on sparse pedestrian flows and/or microscopic regions of a crowd, resolving fine-structure that would otherwise remain unseen (cf. [23]).