1 Introduction

Crowd or pedestrian density is important information for assessing crowd situations in critical environments. Crowd density can be estimated automatically from surveillance cameras. However, visual devices frequently suffer from scalability and reliability problems. Fortunately, mobile devices, such as smartphones or beacons, have become increasingly viable as alternative estimation agents that compensate for the drawbacks of visual devices [1]. The major problem with such devices is that people must be willing to be tracked voluntarily. Another problem is that the number of tracked devices is far smaller than the total number of people in a crowd [11]. Both problems make the estimation challenging. This study is the first attempt to propose a modelling framework that continuously estimates the crowd density distribution on the basis of only a few RF tracking devices.

Our principle of sensor parsimony is useful for many pervasive applications. Despite the maturity of image processing technology in crowd distribution analysis, visual approaches suffer from various limitations. Video devices are normally used on a small scale because digital cameras and their dedicated supporting infrastructure are expensive and require massive deployment to cover a surveillance area. Reliability can also be a problem for vision-based devices in certain critical situations. Surveillance cameras may not function properly in emergencies, such as escaping from a fire or evacuating after an earthquake, or in applications such as human safety monitoring, traffic control, and smart guiding in public spaces. For example, the lighting may be insufficient to capture a clear image. The massive data transmission between cameras and servers may also overload the limited communication capacity during an emergency. Recorded images are also unavailable for calculating crowd density in non-emergency use because consumers’ personal information is protected by law. Voluntary tracking could be one of the few available choices under privacy law restrictions. An alternative approach to acquiring crowd information is through the ever-popular radio-frequency (RF) devices. Smartphones have become a near-universal personal item. RF devices could be reliable and scalable tracking devices if the position information of every pedestrian in the target space could be obtained [3].

Fig. 1. Relations among speed, density, and tracking devices. Individual tracking devices are shown in circles and the number of devices is far less than the total number of pedestrians.

The major challenge in applying RF tracking devices is how to persuade individuals to reveal their private information. Wireless users share their private information only on a voluntary basis, and only a small fraction of pedestrians willingly reveal their locations. A realistic estimate is that fewer than a hundred devices register their locations in a particular space among thousands of people. We tackle this problem by superimposing multiple dynamical models. If people in a space do not move, the locations of most non-tracked individuals are difficult to infer from a few tracked devices; such situations may require auxiliary methods, such as acoustic Doppler waves. However, the more people move, the more information we have and the more accurate the density estimation becomes [5,6,7]. The two upper pictures in Fig. 1 show that a few persons with RF devices provide scant density information in a highly sparse or highly crowded situation. In the two lower pictures in Fig. 1, however, the persons wearing RF tracking devices are represented by circles, and their relative movement helps to estimate the crowd density.

The models first included in the framework are cellular automata (CA) models, ferromagnetic models, social force models, and complexity models. We apply Markov chain Monte Carlo (MCMC) and particle swarm optimization (PSO) for state parameter estimation. The parameters estimated from multiple models are fused by data assimilation (DA) and a continuous ROC estimator, which robustly relaxes the maximum a posteriori value in the optimization algorithm. This study proposes a modelling framework that can connect real-world situations to mathematical models. The scenario simulation provided by the framework can also increase the specificity of the modelling process.

2 Modelling Framework

To address the complicated problem of modelling real-world pedestrian behavior, we provide a new and efficient means of bridging the gap between real-world and theoretical problems.

2.1 General Concept

We propose a three-level framework that performs computations on an abstract-model level over a unified space, collects estimations on a model level over an algorithmic space, and converts real-world signals on an application level over a feature space.

Applications at the top level vary widely in appearance, which makes the underlying algorithms non-reusable across different kinds of real-world situations. The framework therefore maps versatile application domains to a standard feature space using suitable kernel functions. Feature extraction is a mapping from the input data space \({\mathbb R}^N\) to a feature space. For example, a typical sound signal contains tens of thousands of samples per second, and running an algorithm directly on such high-dimensional data is difficult. We use Fourier kernels to transform this signal from the time domain into the frequency domain and then assign a set of significant frequency components to a vector in a low-dimensional feature space. In the pedestrian flow application, some algorithms directly take pedestrians’ sampled positions and velocities as input, whereas others need the detailed information transformed into an aggregate quantity, such as average speed or oscillation frequency. Functions at this level mainly provide common input data when several algorithms require the same information.
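The sound-signal example can be sketched as follows. This is only an illustration under our own assumptions: the function name `fourier_features`, the 16 kHz sampling rate, and the synthetic two-tone signal are hypothetical and not part of the framework. The sketch reduces one second of samples to a few (frequency, amplitude) pairs.

```python
import numpy as np

def fourier_features(signal, sample_rate, k=3):
    """Return the k strongest (frequency_hz, amplitude) pairs of a real signal."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    # One-sided amplitude estimate for each frequency bin.
    amplitude = 2.0 * np.abs(spectrum) / len(signal)
    amplitude[0] = 0.0  # drop the constant (DC) component
    strongest = np.argsort(amplitude)[-k:][::-1]  # indices, strongest first
    return [(float(freqs[i]), float(amplitude[i])) for i in strongest]

# Hypothetical input: one second of a two-tone signal sampled at 16 kHz,
# i.e. a point in R^16000 that is mapped to a 2-component feature vector.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = 1.0 * np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)
features = fourier_features(signal, sample_rate, k=2)
```

The same pattern would apply to aggregate pedestrian quantities such as an oscillation frequency extracted from a sampled speed trace.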

2.2 Framework in Model Level

To tackle the ill-posed problem of estimating crowd density from limited observations, we extend the estimation by superimposing or fusing as many known relations as possible [4]. Figure 2 shows our attempt to incorporate multiple relations at this model level. Each model embodies the relationship of internal states to observations. To isolate changing applications, we do not specify the states and observations. We may later assign them to the density distribution \(\rho (x,y,t)\) and the velocity field \(\mathbf {v}(x,y,t)\) for coordinates \((x,y)\) at time t.

Fig. 2. Superposition of models in our fused assimilation framework

Given a model with known states and observations, the model parameters still possess many degrees of freedom. The admissible parameter sets of each model form a subspace. Figure 2 shows that the smaller the intersection of such subspaces, the higher the likelihood that a unique parameter set can be determined. We apply MCMC and PSO to determine the unknown parameters. The states can be estimated more accurately if another set of observations can be acquired through alternate means (Fig. 3). For example, pedestrian velocity can be obtained through RF devices, surveillance cameras, or Doppler microwaves. The probability of obtaining a unique solution thus increases.

Fig. 3. Fusion through distinct observations
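As a rough illustration of MCMC-based parameter estimation, the sketch below runs a random-walk Metropolis sampler for a single parameter. It is not the calibration algorithm of the framework. As a stand-in model it uses the speed-density relation of Eq. (2), given later in this section. The values of `RHO_MAX`, the true `gamma`, the noise level, the flat prior, and the synthetic observations are all assumptions made for the example.

```python
import numpy as np

V0, RHO_MAX = 1.34, 5.4  # V0 from the text; RHO_MAX is an assumed jam density

def model_speed(rho, gamma):
    """Speed predicted by the speed-density relation for scaling constant gamma."""
    return V0 * (1.0 - np.exp(-gamma * (1.0 / rho - 1.0 / RHO_MAX)))

def log_posterior(gamma, rho, v_obs, sigma):
    """Gaussian log-likelihood with an assumed flat prior on (0, 10]."""
    if gamma <= 0.0 or gamma > 10.0:
        return -np.inf
    resid = v_obs - model_speed(rho, gamma)
    return -0.5 * np.sum((resid / sigma) ** 2)

def metropolis(rho, v_obs, sigma=0.05, steps=5000, step_size=0.1, seed=0):
    """Random-walk Metropolis sampler for gamma; returns the whole chain."""
    rng = np.random.default_rng(seed)
    gamma = 1.0  # arbitrary starting point
    logp = log_posterior(gamma, rho, v_obs, sigma)
    samples = np.empty(steps)
    for i in range(steps):
        proposal = gamma + step_size * rng.normal()
        logp_new = log_posterior(proposal, rho, v_obs, sigma)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.random()) < logp_new - logp:
            gamma, logp = proposal, logp_new
        samples[i] = gamma
    return samples

# Synthetic observations from a few hypothetical tracked devices.
rng = np.random.default_rng(1)
rho_obs = rng.uniform(0.5, 4.0, size=40)
v_obs = model_speed(rho_obs, 1.913) + rng.normal(0.0, 0.05, size=40)
samples = metropolis(rho_obs, v_obs)
gamma_hat = samples[1000:].mean()  # posterior mean after discarding burn-in
```

Adding a second, independent set of observations would simply add another term to `log_posterior`, which is one simple way to read the fusion shown in Fig. 3.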

Various crowd models can be instantiated at this level. Several main assumptions distinguish the models [9, 12]. The major difference among crowd models is the modelling scale. At the microscopic scale, approaches include CA, the agent-based method (ABM), and the Markov random field (MRF). At the macroscopic scale, approaches include the fundamental diagram, social force, fluid dynamics, and game theory. Some models assume homogeneous behaviors among pedestrians, whereas others allow heterogeneous behaviors. Models can also be built on discrete or continuous space and time. In terms of psychological response, the distinction between normal and emergency situations is another important modelling attribute. In either situation, the moving patterns of pedestrians can flow through various topological spaces.

Each pedestrian in MRF models is considered a cell on a grid. These models mainly use probability and statistics to study crowd behavior. Ferromagnetic models are a kind of MRF and originate from the quantum mechanical spin of electrons. A small magnetic dipole moment is associated with the spin. Thus, the spin can be represented by 1 when pointing upward and \(-1\) when pointing downward. A widely used ferromagnetic model is the Ising model.

Let \(\varLambda =\{j\in {\mathbb Z}:|j|\le N\}\) be a symmetric finite hypercube in the integer lattice \({\mathbb Z}\). The configuration space is the set \(\varOmega _\varLambda \) of all sequences \(\omega =\{\omega _j\}_{j\in \varLambda }\), i.e., \(\varOmega _\varLambda = \{-1,1\}^\varLambda \), equipped with its Borel \(\sigma \)-field. Let \(\rho \) be the measure \(\frac{1}{2}\delta _{-1} + \frac{1}{2}\delta _1\) and let \(\pi _\varLambda \mu _\rho \) be the product measure on \(\varOmega _\varLambda \) with identical one-dimensional marginals \(\rho \). This measure satisfies \(\pi _\varLambda \mu _\rho \{F\} = \mu _\rho \{\pi _\varLambda ^{-1}F\}\) for all Borel sets \(F\subseteq \varOmega _\varLambda \). For each \(\omega \in \varOmega _\varLambda \), \(\pi _\varLambda \mu _\rho \{\omega \} = 2^{-|\varLambda |}\), where \(|\varLambda |=2N+1\) in this one-dimensional case.

The coordinate mappings on \(\varOmega _\varLambda \), defined by \(Y_j(\omega )=\omega _j\), are called the spin random variables at the sites j. The Hamiltonian or interaction energy of a spin configuration \(\omega \in \varOmega _\varLambda \) is defined as \(H_{\varLambda ,h}(\omega ) = -\frac{1}{2}\sum _{i,j\in \varLambda } \varPhi (\{i,j\})\omega _i\omega _j - h\sum _{j\in \varLambda }\omega _j\). We call \(\varPhi \) a ferromagnetic interaction potential and assume that \(\varPhi \) is a non-negative function on \({\mathbb Z}\) that is symmetric and translation invariant. The corresponding potential is \(V(\cdot )=\sum _{\varLambda }\varPhi (\cdot )\). The parameter h is a real number that gives the strength of an external magnetic field acting at each site in \(\varLambda \).

Let \(\beta =1/T>0\) be the inverse absolute temperature. The ferromagnetic model is defined by the probability measure \(\mu _{\varLambda ,\beta ,h}\) on \(\varOmega _\varLambda \) as follows:

$$\begin{aligned} \mu _{\varLambda ,\beta ,h}\{\omega \} = \exp \left[ -\beta H_{\varLambda ,h}(\omega ) \right] \pi _\varLambda \mu _\rho \{\omega \} \cdot \frac{1}{Z(\varLambda ,\beta ,h)} \,, \end{aligned}$$
(1)

where the partition function is defined as \(Z(\varLambda ,\beta ,h) = \int _{\varOmega _\varLambda }\exp [-\beta H_{\varLambda ,h}(\omega )]\, \pi _\varLambda \mu _\rho (d\omega ) = \sum _{\omega \in \varOmega _\varLambda }\exp [-\beta H_{\varLambda ,h}(\omega )] \frac{1}{2^{|\varLambda |}}\). The measure \(\mu _{\varLambda ,\beta ,h}\) is called a finite-volume Gibbs state on \(\varLambda \).
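For a very small \(\varLambda \), the Gibbs state of Eq. (1) can be evaluated by brute-force enumeration. The sketch below assumes a nearest-neighbor potential, \(\varPhi (\{i,j\})=J\) if \(|i-j|=1\) and 0 otherwise, with free boundaries. The coupling `J` and the numerical values are illustrative assumptions, not calibrated quantities.

```python
import itertools
import numpy as np

def hamiltonian(omega, J, h):
    """Interaction energy for an assumed nearest-neighbor potential of strength J."""
    omega = np.asarray(omega)
    # Half the sum over ordered pairs equals the sum over unordered neighbor pairs.
    interaction = J * np.sum(omega[:-1] * omega[1:])
    return -interaction - h * np.sum(omega)

def gibbs_state(N, beta, J=1.0, h=0.0):
    """Enumerate all configurations on {-N, ..., N} and return (configs, mu, Z)."""
    sites = 2 * N + 1
    configs = list(itertools.product((-1, 1), repeat=sites))
    prior = 2.0 ** (-sites)  # product measure: each configuration has mass 2^-|Lambda|
    weights = np.array(
        [np.exp(-beta * hamiltonian(w, J, h)) * prior for w in configs]
    )
    Z = weights.sum()  # partition function
    return configs, weights / Z, Z

configs, mu, Z = gibbs_state(N=2, beta=0.5, J=1.0, h=0.2)
# Average spin under the Gibbs state; a positive field h favors +1 spins.
mean_spin = sum(p * np.mean(w) for w, p in zip(configs, mu))
```

Enumeration costs \(2^{|\varLambda |}\) evaluations, so it is only feasible for a handful of sites; larger lattices would need sampling.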

An MRF is a probability measure \(\mu \) on \(\varOmega _\varLambda \) that takes strictly positive values on finite cylinder sets and whose conditional probabilities of the form \(\mu [\omega (x)=1 | \,\omega (\cdot ) \text{ on } \varLambda \backslash x]\) depend only on the values of \(\omega \) at the neighbors of x and are invariant under graph isomorphism. Every MRF is an infinite-volume Gibbs state with a homogeneous nearest-neighbor pair potential \(\varPhi \), and vice versa [8]. Let \(0<p<1\) and \(0<q<1\). We consider the matrix \(M=\left[ \begin{array}{cc} p & 1-p \\ 1-q & q \end{array} \right] \) as the transition matrix of a Markov chain with two states. Let \(\pi =\{\pi (-1),\pi (1)\}\) be the unique stationary distribution (i.e., \(\pi M=\pi \)).
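Solving \(\pi M=\pi \) together with \(\pi (-1)+\pi (1)=1\) gives \(\pi (-1)=\frac{1-q}{2-p-q}\) and \(\pi (1)=\frac{1-p}{2-p-q}\). The short check below uses arbitrary illustrative values of p and q and a hypothetical helper `stationary_distribution`.

```python
import numpy as np

def stationary_distribution(p, q):
    """Closed-form stationary distribution (pi(-1), pi(1)) of the two-state chain."""
    return np.array([1.0 - q, 1.0 - p]) / (2.0 - p - q)

p, q = 0.9, 0.6  # illustrative values only
M = np.array([[p, 1.0 - p],
              [1.0 - q, q]])
pi = stationary_distribution(p, q)

# Rows of M^n approach pi as n grows, whatever the starting state.
long_run = np.linalg.matrix_power(M, 200)
```

For these values the chain spends four fifths of the time in state \(-1\).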

To predict the crowd density inside a topology \(\varLambda \), e.g., a corridor, we need to determine two parameters of the Ising ferromagnetic model: the temperature T and the external field intensity h. We assume that the time-dependent temperature distribution over the corridor is given in an average sense. The crowd density can be modeled as a stochastic process in which each random variable depends on its vicinity. The probability measure of crowd density is obtained at a certain point and time if a finite number of equilibrium states exist. We can then evaluate the density distribution and, based on the temperature information, provide a cost index for a minimum-cost algorithm.

We need the conditional probability at each spot to calculate the nearest-neighbor pair potential. Obtaining the pedestrian count with infinitely high resolution is impossible in real-world applications. We can only estimate the nearest-neighbor pair potential from the boundary condition, that is, from RF tracking devices or smartphone apps carried by a few volunteers.

The macroscopic crowd analysis can start from a relationship between density and speed. To keep the analysis loose and simple, we treat density as an overall average scalar and speed as a non-directional quantity for the entire crowd. We will also discuss models that support vector velocities and density distributions over x-y positions and time.

The fundamental diagram characterizes a speed-density relation [2]. When crowd density falls within a certain range, pedestrian speed decreases as the density increases. The relation arises because neighboring pedestrians influence each other's moving velocity [7]. In a normal situation, free motion is maintained without interference when the density is sufficiently low. The relation is as follows:

$$\begin{aligned} v(\rho ) = v_0 \left\{ 1-\exp \left[ -\gamma \left( \frac{1}{\rho }-\frac{1}{\rho _{max}}\right) \right] \right\} , \end{aligned}$$
(2)

where \(v_0=1.34\) m/s is approximately the free walking speed at low density, \(\rho _{max}\) is the maximal crowd density that completely blocks human movement, and \(\gamma \) is a scaling constant. The relation is also easy to fit to empirical data because of its simplicity.
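A minimal numerical sketch of this speed-density relation and its inverse is given below. The free speed 1.34 m/s is taken from the text. The values of `RHO_MAX` and `GAMMA` are illustrative assumptions, and the function names are hypothetical. The inverse follows by solving the relation for \(\rho \): \(\rho = \left[ \frac{1}{\rho _{max}} - \frac{1}{\gamma }\ln \left( 1-\frac{v}{v_0}\right) \right] ^{-1}\) for \(0\le v<v_0\).

```python
import numpy as np

V0 = 1.34        # free walking speed in m/s (value stated in the text)
RHO_MAX = 5.4    # assumed jam density in persons/m^2 (illustrative)
GAMMA = 1.913    # assumed scaling constant in persons/m^2 (illustrative)

def speed_from_density(rho, v0=V0, rho_max=RHO_MAX, gamma=GAMMA):
    """Average speed for a density 0 < rho <= rho_max."""
    rho = np.asarray(rho, dtype=float)
    return v0 * (1.0 - np.exp(-gamma * (1.0 / rho - 1.0 / rho_max)))

def density_from_speed(v, v0=V0, rho_max=RHO_MAX, gamma=GAMMA):
    """Invert the relation for 0 <= v < v0."""
    v = np.asarray(v, dtype=float)
    return 1.0 / (1.0 / rho_max - np.log(1.0 - v / v0) / gamma)

densities = np.array([0.5, 1.0, 2.0, 4.0, RHO_MAX])
speeds = speed_from_density(densities)   # decreases to zero at RHO_MAX
recovered = density_from_speed(speeds)   # should reproduce `densities`
```

The inversion is only meaningful when the crowd actually follows this single curve, which holds only under special conditions.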

Crowd density is an influencing factor of pedestrian speed and can be uniquely derived from speed only under certain special conditions. In the general case, crowd flocking patterns, obstacles, walking conditions, as well as demographic and cultural aspects significantly affect the fundamental diagram.

The social force model is a macro-level model in which pedestrians interact according to a rule of social force. The model mimics force fields in the physical world. It assumes that the acceleration of a pedestrian depends on the superposition of personal (pers), social (soc), and physical (phys) force fields, denoted \(\mathbf {F}^{(pers)}\), \(\mathbf {F}^{(soc)}\), and \(\mathbf {F}^{(phys)}\) [10, 13]. For a pedestrian of mass \(m_j\) with velocity \(\mathbf {v}_j\), the basic equation of motion is given as

$$\begin{aligned} \frac{d \mathbf {v}_j}{d t} = \mathbf {f}^{(pers)}_j + \mathbf {f}^{(soc)}_j +\mathbf {f}^{(phys)}_j, \end{aligned}$$
(3)

where \(\mathbf {f}^{(\cdot )}_j =\frac{1}{m_j} \mathbf {F}^{(\cdot )}_j\) is the specific force. The force pertaining to social interaction is defined as \(\mathbf {f}^{(soc)}_j = \sum _{l\ne j} \mathbf {f}^{(soc)}_{jl}\), which represents the specific forces due to other pedestrians. The personal force \(\mathbf {f}^{(pers)}_j\) keeps pedestrian j moving at its own preferred velocity \(\mathbf {v}_j(0)\) and is defined as \(\mathbf {f}^{(pers)}_j = \frac{\mathbf {v}_j(0) -\mathbf {v}_j}{\tau _j}\) for an acceleration time \(\tau _j\).

The social force is an analogy of the territorial effect of the private sphere. In psychology, people feel uncomfortable when strangers enter their private spheres. Thus, a repulsive force emerges to separate people. An exponential form is assumed for easy quantification of the forces. The force between persons j and l is given as

$$\begin{aligned} \mathbf {f}^{(soc)}_{jl} = A_j \exp \left[ \frac{R_{jl}-\varDelta r_{jl}}{\xi _j}\right] \mathbf {n}_{jl}, \end{aligned}$$
(4)

where pedestrians j and l are modelled as disks with radius \(R_{jl}\) whose centers are a distance \(\varDelta r_{jl}\) apart. The normal vector \(\mathbf {n}_{jl}\) is a unit vector that points from l to j, so that \(\mathbf {f}^{(soc)}_{jl}\) pushes pedestrian j away from l. The quantity \(A_j\) is a scaling factor, and \(\xi _j\) is the range of the interactions.
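Equations (3) and (4) can be sketched numerically as below. The sketch makes several assumptions that are not taken from the text: the physical force is set to zero, all pedestrians share the same illustrative parameters (`A`, `R_jl`, `xi`, `tau`), and \(\mathbf {n}_{jl}\) is taken as the unit vector pointing from l toward j so that the social force is repulsive.

```python
import numpy as np

def social_force(pos_j, pos_l, A=2.0, R_jl=0.6, xi=0.3):
    """Specific repulsive force on pedestrian j caused by pedestrian l (Eq. 4)."""
    diff = pos_j - pos_l
    dist = np.linalg.norm(diff)   # center-to-center distance
    n_jl = diff / dist            # unit vector from l toward j (assumed convention)
    return A * np.exp((R_jl - dist) / xi) * n_jl

def acceleration(j, positions, velocities, preferred, tau=0.5):
    """Right-hand side of Eq. (3) with the physical force omitted."""
    personal = (preferred[j] - velocities[j]) / tau
    social = sum(
        social_force(positions[j], positions[l])
        for l in range(len(positions)) if l != j
    )
    return personal + social

# Two hypothetical pedestrians walking toward each other along the x-axis.
positions = np.array([[0.0, 0.0], [1.0, 0.0]])
velocities = np.array([[1.0, 0.0], [-1.0, 0.0]])
preferred = np.array([[1.34, 0.0], [-1.34, 0.0]])

f01 = social_force(positions[0], positions[1])  # pushes pedestrian 0 in -x
a0 = acceleration(0, positions, velocities, preferred)
```

Integrating `acceleration` over time, for example with an explicit Euler step, would yield trajectories from which local densities and speeds can be read off.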

3 Conclusions

On the basis of a few RF tracking devices, this study proposes a modelling framework to continuously estimate crowd density distributions, provided that vision analysis is unavailable and the speed-density relation specified by the fundamental diagram is at least maintained. The designated events that drive the crowd movement include tour walking (shopping or museum), spontaneous walking (subway or bus station), and ordered or unordered evacuation (drill or fire). The crowd flow patterns included in the modelling framework are jamming at bottlenecks, congestion points, or stairs, shock waves, oscillation, lane formation, intersections, competition, and various irrational phenomena (panic, herding, and stampede).

The framework suits situations in which the principle of sensor parsimony is prominent. The estimated crowd density can be used for commercial purposes in normal times, for directing evacuations in emergencies, and for improving building or facility layouts at design time.

This study proposes a modelling framework that can connect real-world situations to mathematical models. The scenario simulation provided by the framework can also make the modelling process more precise. In our next paper, detailed parameter calibration algorithms will be presented. Pedestrian behaviors will be investigated empirically in the next stage.